8692 Commits

Author SHA1 Message Date
Juan A. Suarez Romero
2665d012a8 radv: automake: include radv_extensions.py in the tarball
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-23 12:37:01 +02:00
Bas Nieuwenhuizen
a548b727a1 ac/nir: Only clamp shadow reference on radeonsi.
Vulkan CTS does not expect the value to be clamped (at least for D32),
and it makes a difference even though depth is in [0,1], due
to strict inequalities.

I couldn't find anything in the Vulkan spec about this, but the test
seemed to be copied from GL tests and the GL spec only specifies
clamping for fixed point formats. Hence I expect radeonsi to run into
this at some point as well, but given that they still have a usecase
with the Z16->Z32 promotion, I'll leave that for someone else to clean
up.

This at least fixes radv dEQP-VK.texture.shadow.* on VI.

Fixes: 0f9e32519b 'ac/nir: clamp shadow texture comparison value on VI'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-23 09:13:38 +02:00
Bas Nieuwenhuizen
c07d719e8b radv: Disallow indirect outputs for GS on GFX9 as well.
Since it also uses the output vector before writing to memory.

Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-23 00:27:44 +02:00
Bas Nieuwenhuizen
2c5b43c87f ac/nir: Fix nir_texop_lod on GFX for 1D arrays.
Fixes: 1bcb953e16 'radv: handle GFX9 1D textures'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-23 00:27:44 +02:00
Dave Airlie
da9c3cd3ee radv/ac/nir: only emit tess factors to storage if tes reads them
Otherwise we just need to write them to the tf ring.

This seems to improve the tessellation demo on Bonaire
~2190->~2230 fps

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-23 07:10:29 +10:00
Bas Nieuwenhuizen
6ce550453f radv: Don't use vgpr indexing for outputs on GFX9.
Due to LLVM bugs. Fixes a bunch of dEQP-VK.glsl.indexing.*
tests.

Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-22 02:36:37 +02:00
Bas Nieuwenhuizen
ad727b96b6 ac/nir: Account for compact array index in GS input load from LDS.
Mirrors the vram path.

Fixes: d4ecc3c929 'ac/nir: Add loading from LDS for merged GS.'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 22:29:40 +02:00
Bas Nieuwenhuizen
67648c0faa radv: Don't compile shaders when they are cached already.
When the gs_copy_shader is NULL (due to an incomplete cache), but
the main shaders are found, we still do the nir, but we shouldn't
compile the shaders again. For merged shaders we should also account
for the missing shaders.

Fixes: ce03c119ce 'radv: Add code to compile merged shaders.'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 22:29:34 +02:00
Bas Nieuwenhuizen
3bf954b28e radv: Don't check for max GL GS invocations.
We specify 127 instead of 32 as the limit in vulkan.

Fixes: 6bc42855f9 'radv: enable GS on GFX9'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 22:29:09 +02:00
Bas Nieuwenhuizen
050f7e2df2 radv: Don't explicitly reference vertex shader for draw_id.
With merged shaders the vertex shader may not exist. This got in
because the offending patch was written before merged shaders were
upstream, but committed after.

Fixes: 75dfab24a2 'radv: refactor indirect draws with radv_draw_info'
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-21 20:00:22 +02:00
Bas Nieuwenhuizen
20fb15bfe4 radv: Don't reset cmd_buffer->state.dirty.
Otherwise for non-indexed draws we set and immediately unset
RADV_CMD_DIRTY_INDEX_BUFFER. As all the set functions should
clear their own bit, this is unnecessary.

Fixes: 341529dbee 'radv: use optimal packet order for draws'
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-21 20:00:16 +02:00
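
The dirty-bit handling that 20fb15bfe4 relies on can be sketched roughly as below. This is an illustration under assumptions, not the radv code: the flag names, `struct cmd_state` and the helper functions are hypothetical stand-ins, and the rule that a non-indexed draw marks the index buffer dirty is taken from 54fa635f82 further down this log.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical dirty bits, standing in for the RADV_CMD_DIRTY_* flags. */
#define DIRTY_PIPELINE     (1u << 0)
#define DIRTY_INDEX_BUFFER (1u << 1)

struct cmd_state {
   uint32_t dirty;
};

/* Each emit helper clears only the bit it is responsible for. */
static void emit_pipeline(struct cmd_state *s)
{
   /* ... write pipeline packets ... */
   s->dirty &= ~DIRTY_PIPELINE;
}

static void emit_index_buffer(struct cmd_state *s)
{
   /* ... write index buffer packets ... */
   s->dirty &= ~DIRTY_INDEX_BUFFER;
}

static void draw(struct cmd_state *s, bool indexed)
{
   if (s->dirty & DIRTY_PIPELINE)
      emit_pipeline(s);

   if (indexed) {
      if (s->dirty & DIRTY_INDEX_BUFFER)
         emit_index_buffer(s);
   } else {
      /* A non-indexed draw clobbers index state, so request a re-emit
       * before the next indexed draw. */
      s->dirty |= DIRTY_INDEX_BUFFER;
   }

   /* No blanket "s->dirty = 0" here: it would drop the bit set above. */
}
```

With a blanket reset at the end of `draw()`, the bit set in the non-indexed branch would be lost immediately, which is the behaviour the commit message describes.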
Bas Nieuwenhuizen
fb55477990 radv: Correctly detect changed shaders for vertex descriptors.
As they were emitted after the new pipeline, the changed pipeline
detection was not working anymore.

Fixes: 341529dbee 'radv: use optimal packet order for draws'
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-21 19:59:44 +02:00
Bas Nieuwenhuizen
24fe4e6143 ac/nir: Set larger workgroup size for GS on GFX9.
They don't take a single wave anymore and we need the barriers.

Fixes: 6bc42855f9 'radv: enable GS on GFX9'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 12:46:44 +02:00
Bas Nieuwenhuizen
9e82f2b3ea ac/nir: Take the max workgroup size of all provided shaders.
Fixes: ffaf4d608a 'radv: Enable tessellation shaders for GFX9.'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 12:46:28 +02:00
Alex Smith
0fdd531457 radv: Fix pipeline cache locking issues
Need to lock around the whole process of retrieving cached shaders, and
around GetPipelineCacheData.

This fixes GPU hangs observed when creating multiple pipelines in
parallel, which appeared to be due to invalid shader code being pulled
from the cache.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 03:52:43 +02:00
Lionel Landwerlin
c71d44c7f8 anv: don't assert on device init on Cannonlake
v2: Warn that support is still in alpha (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-21 02:37:33 +01:00
Lionel Landwerlin
0c95adaf9e anv: disable stencil pma fix on Gen > 9
This workaround isn't listed on Gen10.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-21 02:37:33 +01:00
Lionel Landwerlin
0c92651a3b blorp: enable R32G32B32X32 blorp ccs copies
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-21 02:37:33 +01:00
Eric Anholt
48615d1ead meson: Fix vc5 deps on the XML-generated headers.
I typoed and was depending on v3d_xml.h (the gzipped xml), not on the
v3d_packet_v33_pack.h that the compiler and QPU packing actually use.
2017-10-20 17:16:00 -07:00
Eric Anholt
07bfdb478b broadcom/vc5: Propagate vc4 aliasing fix to vc5.
See e5fea0d621
2017-10-20 17:09:47 -07:00
Stefan Schake
e5fea0d621 broadcom/vc4: Fix aliasing issue
This was causing Android clang version 3.8.256229 to miscompile,
presumably due to strict aliasing.

Fixes: 14dc281c13 ("vc4: Enforce one-uniform-per-instruction after optimization.")
2017-10-20 17:09:35 -07:00
Dylan Baker
035ec7a2bb meson: Add support for EGL glvnd
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2017-10-20 16:46:48 -07:00
Dylan Baker
108d257a16 meson: build libEGL
This is based heavily on Daniel Stone's work for the same, rebased on
master and with a number of TODO's fixed.

This does not implement glvnd (which is coming in a later patch)

Meson builds egl slightly differently than autotools, namely it doesn't
build an intermediate shared library. It doesn't do this because meson
doesn't have problems with the name of the library being dynamically
generated, so the glvnd and non-glvnd code can follow the same path.

v2: - Don't reuse variable (Eric E.)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-10-20 16:46:48 -07:00
Dylan Baker
ddf06a05ad meson: move wayland_drm_protocol generation to wayland-drm
These files are needed by both vulkan wayland-wsi and by egl
wayland-wsi. Since the XML file is in src/egl/wayland/wayland-drm and we
can include this directory in such a way that it will be loaded before
egl and vulkan, this allows us to avoid multiple calls to the same
generator.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-and-Tested-by: Eric Engestrom <eric@engestrom.ch>
2017-10-20 16:46:48 -07:00
Dylan Baker
8d3b1210cb meson: Don't allow glx to be built without platform_x11
Previously this failed to change with_glx to disabled from auto if
platform_x11 was unset or if no opengl apis were being built.

v2: - swap conditional positions

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-and-Tested-by: Eric Engestrom <eric@engestrom.ch>
2017-10-20 16:46:48 -07:00
Dylan Baker
8792a9e01b meson: bump libdrm_amdgpu requirement to 2.4.85
fixes: b603725703 ("configure.ac: Bump libdrm_amdgpu version to 2.4.85.")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 16:45:39 -07:00
Eric Anholt
5a0d3e1129 nir: Print the components referenced for split or packed shader in/outs.
Having 4 variables all called "gl_in_TexCoord0@n" isn't very informative,
much better to see:

decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0 (VARYING_SLOT_VAR0.x, 1, 0)
decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0@0 (VARYING_SLOT_VAR0.y, 1, 0)
decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0@1 (VARYING_SLOT_VAR0.z, 1, 0)
decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0@2 (VARYING_SLOT_VAR0.w, 1, 0)

v2: Handle arrays and structs better (by Timothy)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-20 16:26:46 -07:00
Eric Anholt
d9ce4ac990 nir: Add a safety check that we don't remove dead I/O vars after lowering.
The pass only looks at var load/store intrinsics, not input load/store
intrinsics, so assert that we don't see the other type.

v2: Adjust comment indentation.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-20 16:26:07 -07:00
Andres Rodriguez
a2c6fbb3ee radv: disable implicit sync for radv allocated bos v3
Implicit sync kicks in when a buffer is used by two different amdgpu
contexts simultaneously. Jobs that use explicit synchronization
mechanisms end up needlessly waiting to be scheduled for long periods
of time in order to achieve serialized execution.

This patch disables implicit synchronization for all radv allocations
except for wsi bos. The only systems that require implicit
synchronization are DRI2/3 and PRIME.

v2: mark wsi bos as RADV_MEM_IMPLICIT_SYNC
v3: Add drm version check (Bas)

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:15:54 +02:00
Andres Rodriguez
eff2bdbd82 radv: factor out radv_alloc_memory
This allows us to pass extra parameters to the memory allocation
operation that are not defined in the vulkan spec. This is useful for
internal usage.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:15:49 +02:00
Andres Rodriguez
92724338ba radv: Expose VK_EXT_global_priority
Expose the extension string as supported

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:01:44 +02:00
Andres Rodriguez
9f7edf4d1f radv: don't skip PS/VS partial flush
This patch helps lower high priority compute latency. Found by
bisecting a perf regression on computeparticles with high priority
compute queues enabled.

Reverting this micro-optimization doesn't seem to have any negative
effect on performance on Dota2 or ssao.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:01:44 +02:00
Andres Rodriguez
fd04f3eb86 radv: Implement VK_EXT_global_priority
This extension allows the caller to change a queue's system wide
priority. This is useful for applications with specific
latency constraints.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:01:44 +02:00
Andres Rodriguez
557de3b9ae radeonsi: hardcode shader WAVE_LIMIT to the maximum value
This is part of a cooperative scheduling approach used by radv. All
drivers in the stack must opt-in to resource arbitration, otherwise GL
based apps will be able to ignore system priorities.

We always hardcode the field to its maximum value, instead of attempting
to calculate an approximate usage. In testing, there were no benefits to
using anything other than the maximum.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:01:44 +02:00
Andres Rodriguez
986c4b0bd4 radv: hardcode shader WAVE_LIMIT to the maximum value
When WAVE_LIMIT is set, a submission will opt-in for SPI based resource
scheduling. Because this mechanism is cooperative, we must ensure that
all submissions have this field set, otherwise they will bypass resource
arbitration.

We always hardcode the field to its maximum value, instead of attempting
to calculate an approximate usage. In testing, there were no benefits to
using anything other than the maximum.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:01:44 +02:00
Andres Rodriguez
b7c2f70656 vulkan: update headers & registry to VK 1.0.63
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:01:44 +02:00
Bas Nieuwenhuizen
b603725703 configure.ac: Bump libdrm_amdgpu version to 2.4.85.
For VK_EXT_global_priority in radv.

Acked-by: Andres Rodriguez <andresx7@gmail.com>
2017-10-21 01:01:44 +02:00
Eric Anholt
9b5fa214f4 broadcom/vc5: Use SETMSF to handle discards.
A bit of spec text suggested that (like vc4) condition codes should be
used for discards, and the simulator was fine with it, but the 7268
disagrees and you have to use SETMSF instead or the color comes through.
Fixes glsl-fs-discard-01 and many of the interpolation-with-clipping
tests.
2017-10-20 15:59:41 -07:00
Eric Anholt
a48a38937c broadcom/vc5: Set the snorm/unorm packing functions to be lowered.
We don't have native instructions for them, so set up the lowering.  Once
we support the bfi instructions that get generated, they should start
actually working.
2017-10-20 15:59:41 -07:00
Eric Anholt
0e6fee7328 broadcom/vc5: Fix pasteo that broke vertex texturing.
We weren't ever filling in the texture state record, so we'd dereference
NULL from the shader.
2017-10-20 15:59:41 -07:00
Eric Anholt
34690536a7 broadcom/vc5: Move default attribute value setup to the CSO and fix them.
I was generating some stub values to bring the driver up, but fill them in
properly now.  We now set 1.0 or 1u as appropriate, and thanks to being in
their own BO it fixes piglit failures on the 7268 (where our 4-byte
alignment was insufficient).

Fixes const-packHalf2x16.shader_test
2017-10-20 15:59:41 -07:00
Eric Anholt
fb15168919 broadcom/vc5: Move most of the shader state attribute record to the CSO.
This should reduce our draw-time overhead, and puts the code where it
should go long term.
2017-10-20 15:53:55 -07:00
Eric Anholt
f4ff8f74ee broadcom/vc5: Fix build failure from nir_shader::stage removal.
Fixes: 59fb59ad54 ("nir: Get rid of nir_shader::stage")
2017-10-20 15:53:55 -07:00
Matt Turner
9cd60fce9c i965/fs: Use align1 mode on ternary instructions on Gen10+
Align1 mode offers some nice features over align16, like access to more
data types and the ability to use a 16-bit immediate. This patch does
not start using any new features. It just emits ternary instructions in
align1 mode.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-10-20 15:00:17 -07:00
Matt Turner
8c16c9c677 i965: Add align1 ternary instruction emission support
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-10-20 15:00:17 -07:00
Matt Turner
f11fa5ac6c i965: Add align1 ternary instruction disassembler support
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-10-20 15:00:17 -07:00
Matt Turner
6c7fc9b73a i965: Add align1 ternary instruction-word support
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-10-20 15:00:17 -07:00
Matt Turner
3b2c868848 i965: Add align1 ternary instruction support to conversion functions
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-10-20 15:00:17 -07:00
Matt Turner
281e8b8f27 i965: Add align1 ternary instruction field encodings
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-10-20 15:00:17 -07:00
Matt Turner
5f6ee55e68 i965: Add functions to abstract access to 3src register types
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-10-20 15:00:17 -07:00
Matt Turner
e15dac319b i965: Rename brw_inst's functions that access the 3src register type
Put hw_ in the name so that it's clear these are the hardware encodings.

Similar to commit 9fb8323328 ("i965: Rename brw_inst's functions that
access the register type")

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-10-20 15:00:16 -07:00
Matt Turner
e7f3b82e03 i965: Rename brw_inst 3src functions in preparation for align1
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-10-20 15:00:16 -07:00
Matt Turner
ba50b538af i965: Print subreg in units of type-size on ternary instructions
The instruction word contains SubRegNum[4:2] so it's in units of dwords
(hence the * 4 to get it in terms of bytes). Before this patch, the
subreg would have been wrong for DF arguments.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-10-20 15:00:16 -07:00
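
The unit conversion described in ba50b538af can be illustrated with a small sketch. The function name and parameters are hypothetical and not taken from the i965 disassembler; the only facts used are that the SubRegNum[4:2] field counts dwords and that the printed value should be in units of the operand's type size.

```c
/* subreg_field: raw SubRegNum[4:2] value from the instruction word,
 * counted in dwords.
 * type_size_bytes: size of the operand type (4 for F, 8 for DF). */
static unsigned subreg_in_type_units(unsigned subreg_field,
                                     unsigned type_size_bytes)
{
   unsigned byte_offset = subreg_field * 4; /* dwords -> bytes */
   return byte_offset / type_size_bytes;    /* bytes -> type-size units */
}
```

For a 4-byte type the result equals the raw field, which is why only 8-byte DF operands were printed incorrectly before the change.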
Matt Turner
3f14150e9a i965: Add functions for brw_reg_type <-> hw 3src type
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-10-20 15:00:16 -07:00
Matt Turner
4c857d1f3b i965: Move brw_reg_type_is_floating_point to brw_reg_type.h
I'm going to call this from brw_inst.h, and I don't want to have to
include all of brw_reg.h.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-10-20 15:00:16 -07:00
Jason Ekstrand
59fb59ad54 nir: Get rid of nir_shader::stage
It's redundant with nir_shader::info::stage.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-20 12:49:17 -07:00
Samuel Pitoiset
341529dbee radv: use optimal packet order for draws
Ported from RadeonSI. The time where shaders are idle should
be shorter now. This can give a little boost, like +6% with
the dynamicubo Vulkan demo.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 20:07:53 +02:00
Samuel Pitoiset
af6985b309 radv: add radv_emit_shaders_prefetch()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 20:07:53 +02:00
Samuel Pitoiset
0d85f4a9e2 radv: add radv_emit_shader_prefetch()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 20:07:53 +02:00
Marek Olšák
46f452dd5f st/mesa: correct a u_vbuf comment
trivial.
2017-10-20 18:56:20 +02:00
Christian Gmeiner
65ccee2dc2 etnaviv: fix implicit conversion warning
Gallium's query_type used in APIs is unsigned.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-20 12:42:55 +02:00
Christian Gmeiner
57a586828f etnaviv: enable occlusion query if GPU supports it
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-20 12:42:48 +02:00
Christian Gmeiner
246243d447 etnaviv: add support for occlusion queries
Passes most occlusion query piglits. The following piglits are broken:
- spec@arb_occlusion_query@occlusion_query_meta_fragments
- spec@arb_occlusion_query@occlusion_query_meta_save
- spec@arb_occlusion_query2@render

v1 -> v2:
 - use one sample provider for all occlusion queries types
 - add comment about 'magic' value 0x1DF5E76

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-20 12:42:44 +02:00
Christian Gmeiner
282d8698ec etnaviv: add basic infrastructure for hw queries
No hardware query is supported yet.

v1 -> v2
 - removed query_type from struct etna_hw_sample_provider

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-20 12:42:40 +02:00
Christian Gmeiner
b8c335c91b etnaviv: update headers from rnndb
Update to etna_viv commit 6c9c706.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-20 12:42:35 +02:00
Chris Wilson
aa65dcd1d7 relnotes/17.3: EGL_IMG_context_priority is now implemented
Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-20 11:28:18 +01:00
Chris Wilson
f72392231b i965: Report supported context priorities to EGL/DRI
Hook up the RendererQuery for __DRI2_RENDERER_HAS_CONTEXT_PRIORITY to
report the available DRM_I915_GEM_CONTEXT_SETPARAM options based on the
per-client default context. The kernel will validate the request to change
the property, so we get an accurate reflection of available support
(based on kernel version and privilege) and we should only have to do it
once during screen setup -- although the SETPARAM should be fast, they
are still an ioctl each.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-20 11:28:17 +01:00
Chris Wilson
1617fca6d1 i965: Pass the EGL/DRI context priority through to the kernel
Decode the EGL/DRI priority enum into the [-1023, 1023] range as
interpreted by the kernel and call DRM_I915_GEM_CONTEXT_SETPARAM to
adjust the priority. We use 0 as the default medium priority (also the
kernel default) and so only need adjust up or down. By only doing the
adjustment if not setting to medium, we can faithfully report any error
whilst setting without worrying about kernel version.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-20 11:28:17 +01:00
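
The decode step described in 1617fca6d1 can be sketched roughly as follows. This is an illustration, not the i965 code: the enum, the low and high values, and the function names are hypothetical. Only the [-1023, 1023] range, the default of 0 for medium, and the rule of skipping the adjustment for medium come from the commit message.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the EGL/DRI priority enum. */
enum ctx_priority {
   CTX_PRIORITY_LOW,
   CTX_PRIORITY_MEDIUM,
   CTX_PRIORITY_HIGH
};

/* Range interpreted by the kernel, per the commit message. */
#define KERNEL_PRIORITY_MIN     (-1023)
#define KERNEL_PRIORITY_MAX     1023
#define KERNEL_PRIORITY_DEFAULT 0

/* Map the enum to a kernel priority value. The low and high values are
 * illustrative picks inside the kernel range, not the driver's constants. */
static int decode_priority(enum ctx_priority p)
{
   switch (p) {
   case CTX_PRIORITY_LOW:
      return KERNEL_PRIORITY_MIN / 2;
   case CTX_PRIORITY_HIGH:
      return KERNEL_PRIORITY_MAX / 2;
   default:
      return KERNEL_PRIORITY_DEFAULT;
   }
}

/* Only a non-default priority needs the SETPARAM call, so a kernel
 * without priority support is never asked when medium is requested. */
static bool needs_setparam(enum ctx_priority p)
{
   return decode_priority(p) != KERNEL_PRIORITY_DEFAULT;
}
```

Because medium never triggers the call, any error returned for low or high can be reported as a real failure rather than being confused with an old kernel.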
Chris Wilson
21023954f8 i965: Record the presence of the kernel scheduler
Mention to the debug log if the kernel scheduler is enabled; and in
particular if it has preemption enabled.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-20 11:28:17 +01:00
Chris Wilson
98c2b7f9fa i965: Sync i915_drm.h from kernel for IMG_context_priority
Pulling in changes up to

    kernel commit ac14fbd460d0ec16e7750e40dcd8199b0ff83d0a
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Tue Oct 3 21:34:53 2017 +0100

	drm/i915/scheduler: Support user-defined priorities

and including the fixup from

    kernel commit 822a4b673284672af697ccd66e8795f8a712a90d
    Author: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Date:   Fri Oct 6 13:45:59 2017 +0300

	drm/i915: Don't use BIT() in UAPI section

for implementing IMG_context_priority.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-20 11:28:17 +01:00
Chris Wilson
5c5618338a egl,dri: Propagate context priority hint to driver->CreateContext
Jump through the layers of abstraction between egl and dri in order to
feed the context priority attribute through to the backend. This
requires us to read the value from the base _egl_context, convert it to
a DRI attribute, parse it again in the generic context creator before
passing it to the driver as a function parameter.

In order to not require us to pass back the actual value of the context
priority after creation, we impose that drivers should report the
available set of priorities during screen setup (and then they may choose
to fail if given an invalid value as that should have been checked at
the user boundary.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Ben Widawsky <ben@bwidawsk.net> # i915/i965
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-20 11:28:17 +01:00
Chris Wilson
95ecf3df62 egl: Support IMG_context_priority
IMG_context_priority
https://www.khronos.org/registry/egl/extensions/IMG/EGL_IMG_context_priority.txt

    "This extension allows an EGLContext to be created with a priority
    hint. It is possible that an implementation will not honour the
    hint, especially if there are constraints on the number of high
    priority contexts available in the system, or system policy limits
    access to high priority contexts to appropriate system privilege
    level. A query is provided to find the real priority level assigned
    to the context after creation."

The extension adds a new eglCreateContext attribute for choosing a
priority hint. This stub parses the attribute and copies into the base
struct _egl_context, and hooks up the query similarly.

Since the attribute is purely a hint, I have no qualms about the lack of
implementation before reporting back the value the user gave!

v2: Remember to set the default ContextPriority value to medium.
v3: Use the driRendererQuery interface to probe the backend for
supported priority values and use those to mask the EGL interface.
v4: Treat the priority attrib as a hint and gracefully mask any requests
not supported by the driver, the EGLContext will remain at medium
priority.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rob Clark <robdclark@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-20 11:28:17 +01:00
Fredrik Höglund
e2053b8e3d radv: don't flush the VS when srcStageMask == TOP_OF_PIPE_BIT
The Vulkan specification says:

   "... an execution dependency with only
    VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT in the source stage mask will
    effectively not wait for any prior commands to complete."

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-20 11:37:51 +02:00
Samuel Pitoiset
565c22158f radv: mark total_count as MAYBE_UNUSED in CmdSet{Viewport,Scissor}
Fixes two compilation warnings in release build. Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-20 11:22:19 +02:00
Samuel Pitoiset
c8f2b73656 radv: rename radv_cmd_buffer_flush_state() to radv_draw()
Similar to the dispatch codepath.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 11:20:16 +02:00
Samuel Pitoiset
9e45e5c9fd radv: emit primitive restart from radv_emit_draw_registers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 11:20:14 +02:00
Samuel Pitoiset
93207a8e89 radv: add radv_emit_draw_registers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 11:20:12 +02:00
Samuel Pitoiset
9466856456 radv: refactor indirect draws (+count buffer) with radv_draw_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 11:20:11 +02:00
Samuel Pitoiset
75dfab24a2 radv: refactor indirect draws with radv_draw_info
Indirect draws with a count buffer will be refactored in a
separate patch.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 11:20:08 +02:00
Samuel Pitoiset
03afa95d9f radv: refactor simple and indexed draws with radv_draw_info
Similar to the dispatch compute logic but for draw calls. For
convenience, indirect draws will be converted in a separate
patch.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 11:20:05 +02:00
Samuel Pitoiset
54fa635f82 radv: re-emit VGT_INDEX_TYPE because non-indexed draws overwrite it
Only on CIK and later. We should only update VGT_INDEX_TYPE but
it seems easier to re-emit all the index buffer packets.

Fixes: 966d66f28f (radv: do not re-emit the index buffer for every draw call)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 10:40:01 +02:00
Samuel Pitoiset
eae46f192e radv: clear the dirty flags in the corresponding emit helpers
This will allow us to fix the VGT_INDEX_TYPE issue properly.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 10:39:28 +02:00
Samuel Pitoiset
68cd3564a0 radv: rename RADV_CMD_DIRTY_RENDER_TARGETS to RADV_CMD_DIRTY_FRAMEBUFFER
To be consistent with the emit function name.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 10:39:26 +02:00
Samuel Pitoiset
94e69f4141 radv: move DB_COUNT_CONTROL initialization to si_emit_config()
CLEAR_STATE will initialize DB_COUNT_CONTROL to 0 for CIK+.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 10:38:11 +02:00
Samuel Iglesias Gonsálvez
9e515cf381 i965/vec4: remove setting default LOD in the backend
It is already done in NIR.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-20 08:29:53 +02:00
Samuel Iglesias Gonsálvez
c6d7d09bd0 i965/fs: remove setting default LOD in the backend
It is already done in NIR.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-20 08:29:53 +02:00
Samuel Iglesias Gonsálvez
e382890e25 nir: set default lod to texture opcodes that needed it but don't provide it
v2:
- Use helper to add a new source to the texture instruction.

v3:
- Use nir_tex_instr_src_index() to simplify the patch (Jason).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-20 08:29:09 +02:00
Bas Nieuwenhuizen
6bc42855f9 radv: enable GS on GFX9
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 07:14:00 +01:00
Bas Nieuwenhuizen
73749caf0e radv: calculate and emit GFX9 GS registers to pipeline state.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:47 +01:00
Bas Nieuwenhuizen
9961ae2447 ac/nir: Fix up GS input vgprs.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:37 +01:00
Bas Nieuwenhuizen
d4ecc3c929 ac/nir: Add loading from LDS for merged GS.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:29 +01:00
Bas Nieuwenhuizen
ec53e52742 ac/nir: Add ES output to LDS for GFX9.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:18 +01:00
Bas Nieuwenhuizen
3e77333030 ac/nir: Add merged GS function.
[airlied: merged fixup + and fixed up a couple more bits].

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:14 +01:00
Bas Nieuwenhuizen
f82797b56d radv: Only emit TES when it exists.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:14:14 +01:00
Bas Nieuwenhuizen
6e21b7a294 radv: Use control shader presence for detecting tess.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:11:10 +01:00
Dave Airlie
5bc5e07d81 radv: fixup tess eval shader when combined.
This fixes some access to the tess eval shader when it's combined
with geometry on gfx9.

This is a review of Bas's commit:
radv: Prevent crashing by accessing TES for VGT reuse depth.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:11:10 +01:00
Bas Nieuwenhuizen
e6acc20b6a radv: Set VGT_GS_MODE properly for gfx9
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 05:55:11 +01:00
Dave Airlie
99281c1e8f radv: ensure correct outinfo is picked.
This struct used to rely on being in a union, it isn't anymore,
so we have to pick the correct outinfo struct now.

This should fix a regression since the union became a struct.

dEQP-VK.tessellation.geometry_interaction.point_size.vertex_set_geometry_set

Fixes: 6078a3bd51 (ac/nir: Allow ac_shader_variant_info to contain info about multiple stages.)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-20 14:44:09 +10:00
George Kyriazis
f9d239e11f swr: Rework scratch space allocation
Remove allocation of > 2kbyte buffers into context memory in
swr_copy_to_scratch_space() (which is used to copy small vertex/index buffers
and shader constants to a scratch space to be used by the upcoming draw.)

Large shader constant allocations need to be done in the circular scratch
buffer instead of context memory, because their values persist across
render calls.

Also lower SCRATCH_SINGLE_ALLOCATION_LIMIT to 8k, since allocations of larger
buffers will get too large for the circular scratch space.

Fixes render issues with CEI Ensight.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 20:18:09 -05:00
Bas Nieuwenhuizen
ffaf4d608a radv: Enable tessellation shaders for GFX9.
It mostly works now.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 01:50:43 +02:00
Dave Airlie
1dda214d9c ac/nir: init full exec mask for merged shaders.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 01:50:40 +02:00
Dave Airlie
14978a1c3b radv: drop unused r600_htile_info.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-20 00:38:57 +01:00
Dave Airlie
c8eb3558cc radv: fix CLEAR_STATE packet length.
Looking at shader traces I noticed some registers were missing,
one of them was being eaten by the wrong clear state length.

Fixes: 4f42ea4dc (radv: use CLEAR_STATE for initializing some registers)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-19 23:56:48 +01:00
Dylan Baker
a447f9fe7b meson: don't build gallium dri target if gallium is disabled
Otherwise -Dgallium-drivers= will cause libmesa_gallium to be built and
the megadriver install script to attempt to install drivers without any
actual drivers being built.

fixes: 66f97f6640 ("meson: build radeonsi")
Reported-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
2017-10-19 15:17:34 -07:00
Timothy Arceri
087e010b2b radv: copy indirect lowering settings from radeonsi
It looks like the original indirect mask was probably copied from
ANV.

Sascha Willems demo results:

tessellation ~4000 -> ~4200 fps

V2: continue lowering local indirects due to llvm deficiencies.

Tested-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 08:01:26 +11:00
Timothy Arceri
5549b47d7b radv: stop redundant setting of active_stages
We already set it above in the NIR compilation loop.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-20 08:01:26 +11:00
Timothy Arceri
bebfeb7e1c ac: move some code out of loop in store_tcs_output()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-20 08:01:26 +11:00
Bas Nieuwenhuizen
228325f4b7 radv: Modify rsrc1/rsrc2 generation for merged tess.
No OC_LDS_EN for HS, and the included LS vgpr_comp_cnt is at
a different offset.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:44 +02:00
Bas Nieuwenhuizen
8250efb90a radv: Set correct registers for merged shader rings.
We need different regs to end up in s0/s1.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:39 +02:00
Bas Nieuwenhuizen
6a074f87be radv: Add GFX9 HS emitting code.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:34 +02:00
Bas Nieuwenhuizen
b096245030 radv: Remove remaining hard coded references to VS.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:31 +02:00
Bas Nieuwenhuizen
91b033f4f6 radv: Update GFX9 user data regs for GS/tess.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:27 +02:00
Bas Nieuwenhuizen
ce03c119ce radv: Add code to compile merged shaders.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:23 +02:00
Bas Nieuwenhuizen
640f2c458f ac/nir: Add LS-HS input VGPR workaround.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:19 +02:00
Bas Nieuwenhuizen
0a182e73d9 ac/nir: Compile the bodies of multiple shaders.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:15 +02:00
Bas Nieuwenhuizen
56d8af1ec5 ac/nir: Expand user SGPR descriptions a bit.
To prevent VS/TCS collisions in merged shaders.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:07 +02:00
Bas Nieuwenhuizen
25efef40d2 ac/nir: Don't write to the dynamic HS word on GFX9.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:04 +02:00
Bas Nieuwenhuizen
d8bd693d03 ac/nir: Add function creation for merged LS+HS.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:00 +02:00
Bas Nieuwenhuizen
0cdc8b26f8 ac/nir: Make scan_shader_output_decl less dependent on the context.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:24:56 +02:00
Bas Nieuwenhuizen
6078a3bd51 ac/nir: Allow ac_shader_variant_info to contain info about multiple stages.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:24:51 +02:00
Bas Nieuwenhuizen
a996ed1f9b ac/nir: Change interface to allow multiple source shaders.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:24:47 +02:00
Bas Nieuwenhuizen
872b21487c ac/nir: Add HS calling convention.
Needed for GFX9 merged shaders.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:24:42 +02:00
Bas Nieuwenhuizen
163a4bf386 ac: Parse the new HS RSRC1 register.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:24:20 +02:00
Tim Rowley
bfda35c8dd swr: knob overrides for Intel Xeon Phi
Architecture benefits from having more threads/work outstanding.

Patch by Jan Zielinski.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley
028ffa5e18 swr/rast: Add api to override draws in flight
Allow draws in flight to be overridden via SWR_CREATECONTEXT_INFO.

Patch by Jan Zielinski.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley
2559f2b93e swr/rast: Widen fetch shader to SIMD16 (disabled for now)
Refactored the gather operation to process 16 elements at a time via
paired SIMD8 operations.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley
49090ccf54 swr/rast: Change DS memory allocation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley
04ea03d99d swr/rast: Fix indentation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley
62e2d657c8 swr/rast: Miscellaneous viewport array code changes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Tim Rowley
ed1db803fa swr/rast: Minor changes for os-x
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-19 13:10:55 -05:00
Kenneth Graunke
82144b7392 i965: Don't disable aux buffers for non-overlapping miplevels.
Meta's GenerateMipmap implementation binds the same image for both
sampling and rendering - but it samples from one miplevel while
rendering the next.  This is a false self-dependency, and there's
no need to disable auxiliary buffers in this case.  In fact, we really
want to leave it enabled so the new miplevels gain color compression.

Thankfully, the texture object's _MaxLevel is always one shy of the
miplevel being rendered.  So we can simply check if irb->mt_level
overlaps with the texture's defined levels.  If not, there's no
self-dependency and we can leave the auxiliary buffers enabled.
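
A minimal sketch of the overlap test described above. The struct and
function names are hypothetical and only illustrate the idea; this is
not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the texture state: the inclusive range of
 * miplevels the sampler may read. max_level plays the role of the
 * texture object's _MaxLevel. */
struct tex_levels {
   unsigned base_level;
   unsigned max_level;
};

/* Returns true when the miplevel being rendered (irb->mt_level in the
 * message above) lies inside the sampled range, i.e. there is a real
 * self-dependency.  When it returns false, the auxiliary buffers can
 * stay enabled. */
bool
render_level_overlaps_texture(const struct tex_levels *tex,
                              unsigned render_level)
{
   assert(tex->base_level <= tex->max_level);
   return render_level >= tex->base_level &&
          render_level <= tex->max_level;
}
```

In the GenerateMipmap case, the sampled range ends one level below the
level being rendered, so the check reports no overlap.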

Fixes a performance regression in GFXBench4 Car Chase, which apparently
calls glGenerateMipmap() on every frame.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103247
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-19 11:10:00 -07:00
Kenneth Graunke
fa6ca6991b i965: Remove the intel_miptree_prepare_fb_fetch wrapper.
Now that intel_miptree_prepare_texture takes levels and layers, there's
not much use in this anymore.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-19 11:10:00 -07:00
Kenneth Graunke
e208d7f874 i965: Only resolve texture levels/layers that are accessed.
This should avoid unnecessary resolves when working with texture views.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-19 11:10:00 -07:00
Kenneth Graunke
0954ce1000 i965: Make intel_miptree_prepare_texture() take level/layer arguments.
This effectively exports intel_miptree_prepare_texture_slices() as
intel_miptree_prepare_texture().  The hope is to avoid resolves for
when using texture views that access a subset of the levels/layers.

For now, we pass the same arguments to separate the mechanical change
from the one that actually modifies our behavior.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-19 11:10:00 -07:00
Tim Rowley
33bdbc1db4 gallium: add more exceptions to tgsi_util_get_inst_usage_mask
A number of double/int64 operations don't have matching
read and write usage masks, which the fallthrough case of
tgsi_util_get_inst_usage_mask assumes for componentwise
tagged instructions.

No regressions in llvmpipe piglit; fixes a large number of
swr regressions.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-19 12:49:32 -05:00
Kenneth Graunke
113a6a639f isl: Fix width check in isl_gen7_choose_msaa_layout.
The restriction is supposed to apply if the width *field* is >= 8192,
meaning the actual width *value* is >= 8193.

The code also incorrectly used == for some reason.
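
A small sketch of the distinction between the width field and the
width value. It assumes the field stores (width - 1), which is what
the numbers in the message imply; the function names are made up for
illustration and are not isl's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the surface-state width field holds (width - 1), so a
 * field value of 8192 corresponds to an actual width of 8193. */
uint32_t
width_field_from_width(uint32_t width_px)
{
   assert(width_px >= 1);
   return width_px - 1;
}

/* True when the restriction applies.  The comparison is >= on the
 * field; an == test would miss every width above the threshold. */
bool
msaa_width_restriction_applies(uint32_t width_px)
{
   return width_field_from_width(width_px) >= 8192;
}
```

With this, a width of 8192 is not restricted, while 8193 and anything
larger is.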

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-19 10:21:45 -07:00
Kenneth Graunke
68f69ebdcc i965: Use is_scheduling_barrier instead of schedule_node::is_barrier.
Commit a73116ecc6 tried to make add_barrier_deps()
walk to the next barrier, and stop.  To accomplish that, it added an
is_barrier flag.  Unfortunately, this only works half of the time.

The issue is that add_barrier_deps() walks both backward (to the
previous barrier), and forward (to the next barrier).  It also sets
is_barrier.  Assuming that we're processing instructions in forward
order, this means that is_barrier will be set for previous instructions,
but not future ones.  So we'll never see it, and walk further than we
need to.
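
To illustrate the walk, here is a simplified sketch over a flat array
instead of the scheduler's node list. The struct and function are
hypothetical. The point is only that the stop condition is a property
of each instruction that is known up front, not a flag that is set as
barriers get processed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction record.  The flag is an intrinsic property
 * of the instruction, so it is valid for instructions both before and
 * after the one currently being processed. */
struct inst {
   bool is_scheduling_barrier;
};

/* Count the dependency edges needed for a barrier at index pos: walk
 * backward up to and including the previous barrier, then forward up
 * to and including the next barrier. */
size_t
count_barrier_deps(const struct inst *insts, size_t count, size_t pos)
{
   size_t deps = 0;

   assert(pos < count);

   for (size_t i = pos; i-- > 0; ) {
      deps++;
      if (insts[i].is_scheduling_barrier)
         break;
   }

   for (size_t i = pos + 1; i < count; i++) {
      deps++;
      if (insts[i].is_scheduling_barrier)
         break;
   }

   return deps;
}
```

If the forward walk could not see that a later instruction is a
barrier, it would continue to the end of the block, which is the extra
work the message describes.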

dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23
now compiles its shaders in 3.6 seconds instead of 3.3 minutes.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Pallavi G <pallavi.g@intel.com>
2017-10-19 10:19:20 -07:00
Kenneth Graunke
3d112a7cd4 i965: Move fs_inst::has_side_effects()'s eot check to the parent class.
This eliminates a layer of wrapping, and makes a backend_instruction
sufficient.  The downside is that it exposes 'eot' to the vec4 backend,
which it doesn't need, but can basically happily ignore.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Pallavi G <pallavi.g@intel.com>
2017-10-19 10:19:20 -07:00
Roland Scheidegger
77b8392858 tgsi: fix tgsi_util_get_inst_usage_mask
The logic for handling shadow coords was completely broken.
Fixes be3ab867bd.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103265

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-19 16:33:39 +02:00
Emil Velikov
a6c55243b9 docs: update calendar, add news item and link release notes for 17.2.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-19 13:31:59 +01:00
Emil Velikov
d5fdc37263 docs: add sha256 checksums for 17.2.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit facc851818)
2017-10-19 13:31:59 +01:00
Emil Velikov
b1605550a6 docs: add release notes for 17.2.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 28dc4b64f2)
2017-10-19 13:31:59 +01:00
Iago Toral Quiroga
2d87caa279 glsl/linker: produce error when invalid explicit locations are used
We only need to add a check to validate output locations here. For
inputs with invalid locations we will fail to link when we can't
find a matching output in the same (invalid) location.

v2: compute location slots properly depending on shader stage and
    variable type / direction

Fixes:
KHR-GL45.enhanced_layouts.varying_location_limit

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-19 11:27:12 +02:00
Iago Toral Quiroga
16631ca30e i965/sbe: fix active components for SSO programs with over 16 inputs
When we have up to 16 FS inputs, the SF unit will reorder our inputs
to be consecutive. However, when we have more than 16, we need to
read our inputs from the URB exactly as they have been
output from the previous stage. This means that for SSO we have to
consider if we have URB padding due to unused input locations.

Specifically, this affects gen9 active components programming, since
for things to work in scenarios with over 16 inputs that have padded
regions we need to ensure that we program active components for the
padded regions too. If we don't do this the hardware won't read
the URB properly for inputs located after padded regions.

Found empirically.

Fixes (these also require a patch in CTS):
KHR-GL45.enhanced_layouts.varying_locations
KHR-GL45.enhanced_layouts.varying_array_locations

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-19 08:31:42 +02:00
Chris Wilson
b7c655f700 i965: Do not log a perf warning when mapping an idle bo
We only want to scare the user away from causing a GPU stall for mapping
a busy bo. The time taken to instantiate the set of pages for a buffer
and their mmapping is unavoidable and flagging idle bo as being busy is
"crying wolf".

Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-19 07:12:39 +01:00
Matt Turner
e9796ebca7 i965: Use a union to bitcast a float
... which does not break C's aliasing rules.
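
For reference, a generic sketch of the union technique (not the code
from this commit; the function name is made up). It assumes float is
32 bits wide, which the static assertion checks.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(float) == sizeof(uint32_t),
              "this sketch assumes a 32-bit float");

/* Return the bit pattern of f.  Writing one union member and reading
 * another reinterprets the stored bytes, whereas dereferencing a
 * float pointer cast to uint32_t * would violate C's aliasing rules. */
uint32_t
float_bits(float f)
{
   union {
      float f;
      uint32_t u;
   } bits;

   bits.f = f;
   return bits.u;
}
```

A memcpy() between the two objects is another common way to get the
same result.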
2017-10-18 22:16:46 -07:00
Darren Salt
5767ce7d0d drirc: Group a few games in the glthread whitelist together.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-10-19 03:28:34 +02:00
Darren Salt
80c20b29d8 drirc: Enable glthread for more games (Saints Row 4 & Gat out of Hell).
“Saints Row: Gat out of Hell” benefits from this on slower CPUs in that
usage spikes on individual cores are avoided, which in turn makes it harder
to hit a bug which causes broken audio and the game to hang on exit.

“Saints Row IV” appears to be fine either way, but also exhibits the audio
breakage bug: glthread is therefore being enabled on the grounds that it should
make it a little harder to hit that bug.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-10-19 03:28:34 +02:00
Samuel Pitoiset
535aa43df0 radv: reset dirty flags after flushing all states
Move it to radv_cmd_buffer_flush_state() because if
rasterizerDiscardEnable is true, the flags are not cleared.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 21:21:48 +02:00
Samuel Pitoiset
966d66f28f radv: do not re-emit the index buffer for every draw call
It can only be changed when CmdBindIndexBuffer() is called
or when a secondary buffer is used. The latter does not always change
it, but let's re-emit the packets in that situation for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 21:21:43 +02:00
Samuel Pitoiset
e5480be0d1 radv: remove useless mask operation in radv_cs_emit_draw_indexed_packet()
This saves a few CPU cycles when CmdDrawIndexed() is used a lot.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 21:21:30 +02:00
Bas Nieuwenhuizen
fa226e9933 radv: Do not read from the disk cache with RADV_DEBUG=nocache.
Otherwise the flag is borderline useless.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-18 20:37:10 +02:00
Alex Smith
2cccc74f56 radv: Set active_stages after getting cached shaders
Fixes: 7d45d22fdd ("radv: switch to using radv_create_shaders()")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 20:37:10 +02:00
Alex Smith
f557673237 radv: Don't free NIR shaders if tracing
Fixes a crash while generating a hang report.

Fixes: 7d45d22fdd ("radv: switch to using radv_create_shaders()")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 20:37:10 +02:00
Marek Olšák
84f3afc2e1 Revert "egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku}"
This reverts commit 8cb84c8477.

This fixes crashing shader-db/run.
2017-10-18 20:23:42 +02:00
Marek Olšák
2cb9ab53dd Revert "egl: drop EGL driver name"
This reverts commit 6414d6bd8d.

This is needed to apply the next revert.
2017-10-18 20:23:24 +02:00
Miklós Máté
f37af5ec8d st/mesa: set dimension for constants in ATI_fragment_shader
This fixes an assertion failure introduced by 30a2f0dfd4.

Fixes: 30a2f0dfd4 ("radeonsi: add an assertion that only

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-10-18 19:36:53 +02:00
Michel Dänzer
8c9e7c9638 st/osmesa: include u_inlines.h for pipe_resource_reference
Fixes build failure due to unresolved symbol.

Fixes: 7561da367b "st/mesa: Initialize textures array in
                     st_framebuffer_validate"

Trivial.
2017-10-18 18:44:58 +02:00
Michel Dänzer
7561da367b st/mesa: Initialize textures array in st_framebuffer_validate
And just reference pipe_resources to it in the validate callbacks.

Avoids pipe_resource leaks when st_framebuffer_validate ends up calling
the validate callback multiple times, e.g. when a window is resized.

v2:
* Use generic stable tag instead of Fixes: tag, since the problem could
  already happen before the commit referenced in v1 (Thomas Hellstrom)
* Use memset to initialize the array on the stack instead of allocating
  the array with os_calloc.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
2017-10-18 18:28:00 +02:00
Eric Engestrom
47273d7312 egl: set UseFallback if LIBGL_ALWAYS_SOFTWARE is set
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-18 17:25:41 +01:00
Eric Engestrom
6414d6bd8d egl: drop EGL driver name
The "DRI2" name was reported as confusing when printing EGL infos (one
user reported thinking DRI3 was not working on his X server), and the
only alternative is Haiku, which can only be used on a Haiku machine.

The name therefore doesn't add any information that the user wouldn't
know already, so let's just drop it.

Cc: Kai Wasserbäch <kai@dev.carbon-project.org>
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Related-to: b174a1ae72 ("egl: Simplify the "driver" interface")
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-18 17:25:41 +01:00
Eric Engestrom
d7e769abec egl: drop always-false TestOnly option
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-18 17:25:41 +01:00
Nicholas Miell
3012885b3f Fix the xf86vm meson dependency
The pkg-config file is called xxf86vm.

Signed-off-by: Nicholas Miell <nmiell@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-18 17:25:41 +01:00
Eric Engestrom
8cb84c8477 egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku}
Note: dropping the EGL_BAD_ALLOC in egl_haiku because it's
overwritten by the EGL_NOT_INITIALIZED in eglInitialize().

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-18 17:25:41 +01:00
Eric Engestrom
4893673b15 egl_dri2: drop dri2_egl_driver struct
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-18 17:25:41 +01:00
Eric Engestrom
7823cfe9fe egl_dri2: move glFlush out of struct dri2_egl_driver
There's no reason to store this there, it doesn't depend on the driver.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-18 17:25:41 +01:00
Roland Scheidegger
3d0deed12a llvmpipe: handle shader sample mask output
This probably isn't all that useful for GL, but there are apis where
sample_mask is a valid output even without msaa.
Just discard the pixel if the sample_mask doesn't include the bit for
sample 0.
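
The rule above, written out as a tiny sketch. The function name is
hypothetical; the real change lives in the generated fragment shader
code, not in a C helper like this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Without MSAA only sample 0 exists, so the pixel is kept only if the
 * shader's sample mask output has bit 0 set. */
bool
pixel_survives_sample_mask(uint32_t sample_mask)
{
   return (sample_mask & 1u) != 0;
}
```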

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-10-18 18:16:44 +02:00
Vinson Lee
c5124fbc74 anv: Fix instance typos.
Fix build error.

  CC       vulkan/vulkan_libvulkan_common_la-anv_device.lo
In file included from vulkan/anv_device.c:33:0:
vulkan/anv_device.c: In function ‘anv_AllocateMemory’:
vulkan/anv_device.c:1562:37: error: ‘struct anv_device’ has no member named ‘instace’; did you mean ‘instance’?
          result = vk_errorf(device->instace, device,
                                     ^
vulkan/anv_private.h:317:17: note: in definition of macro ‘vk_errorf’
     __vk_errorf(instance, obj, REPORT_OBJECT_TYPE(obj), error,\
                 ^~~~~~~~

Fixes: 9775894f10 ("anv: Move size check from anv_bo_cache_import() to caller (v2)")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-18 09:08:08 -07:00
Brian Paul
e17aa6cd9d mesa: fix trivial typo in _mesa_PixelMapusv() error string
Signed-off-by: Brian Paul <brianp@vmware.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103323
2017-10-18 09:53:00 -06:00
Eric Engestrom
2515eb63f8 meson: move expat dependency where it's needed
Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-18 14:27:20 +01:00
Hongxu Jia
05fc62d89f automake: intel: move expat handling where it's used
Linking libvulkan_intel.so can fail, due to unresolved references to
libexpat.so.

EXPAT_CFLAGS should be moved as well.

Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-18 14:27:20 +01:00
Timothy Arceri
e5e9e21e9f radv: don't create dummy fs when compiling compute stage
Fixes: d1c9f30d7f "radv: add radv_create_shaders() helper"

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 22:47:53 +11:00
Samuel Pitoiset
e6b9abf294 radv: use the dispatch initiator for indirect dispatches
Missed that when I allowed waves to be launched out-of-order.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 11:22:41 +02:00
Samuel Pitoiset
095e709717 radv: remove XtoY_temps structs
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 11:22:39 +02:00
Tapani Pälli
6ef9bea734 anv: Install as Vulkan HAL module in Android.mk build
Now that anvil fully implements the Vulkan HAL interface, we can install
it as the vendor HAL module at /vendor/lib/hw/vulkan.${board}.so. To do
so:

  - Rename LOCAL_MODULE to vulkan.$(TARGET_BOARD_PLATFORM).
  - Use LOCAL_PROPRIETARY_MODULE to install under vendor path.

Tested by running different Sascha Willems demos on Android-IA.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
[chadv: Extract this hunk from Tapani's patch, and embed it as
 stand-alone patch in my arc-vulkan series].
Signed-off-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-18 00:23:38 -07:00
Chad Versace
053d4c328f anv: Implement VK_ANDROID_native_buffer (v9)
This implementation is correct (afaict), but takes two shortcuts
regarding the import/export of Android sync fds.

  Shortcut 1. When Android calls vkAcquireImageANDROID to import a sync
  fd into a VkSemaphore or VkFence, the driver instead simply blocks on
  the sync fd, then puts the VkSemaphore or VkFence into the signalled
  state. Thanks to implicit sync, this produces correct behavior (with
  extra latency overhead, perhaps) despite its ugliness.

  Shortcut 2. When Android calls vkQueueSignalReleaseImageANDROID to export
  a collection of wait semaphores as a sync fd, the driver instead
  submits the semaphores to the queue, then returns sync fd -1, which
  informs the caller that no additional synchronization is needed.
  Again, thanks to implicit sync, this produces correct behavior (with
  extra batch submission overhead) despite its ugliness.

I chose to take the shortcuts instead of properly importing/exporting
the sync fds for two reasons:

  Reason 1. I've already tested this patch with dEQP and with demo
  apps. It works. I wanted to get the tested patches into the tree now,
  and polish the implementation afterwards.

  Reason 2. I want to run this on a 3.18 kernel (gasp!). In 3.18, i915
  supports neither Android's sync_fence, nor upstream's sync_file, nor
  drm_syncobj. Again, I tested these patches on Android with a 3.18
  kernel and they work.

I plan to quickly follow-up with patches that remove the shortcuts and
properly import/export the sync fds.

Non-Testing
===========
I did not test at all using the Android.mk buildsystem. I may have broken
it. Please test and review that.

Testing
=======
I tested with 64-bit ARC++ on a Skylake Chromebook and a 3.18 kernel.
The following pass (as of patchset v9):

  - a little spinning cube demo APK
  - several Sascha demos
  - dEQP-VK.info.*
  - dEQP-VK.api.wsi.android.*
      (except dEQP-VK.api.wsi.android.swapchain.*.image_usage, because
      dEQP wants to create swapchains with VK_IMAGE_USAGE_STORAGE_BIT)
  - dEQP-VK.api.smoke.*
  - dEQP-VK.api.info.instance.*
  - dEQP-VK.api.info.device.*

v2:
  - Reject VkNativeBufferANDROID if the dma-buf's size is too small for
    the VkImage.
  - Stop abusing VkNativeBufferANDROID by passing it to vkAllocateMemory
    during vkCreateImage. Instead, directly import its dma-buf during
    vkCreateImage with anv_bo_cache_import(). [for jekstrand]
  - Rebase onto Tapani's VK_EXT_debug_report changes.
  - Drop `CPPFLAGS += $(top_srcdir)/include/android`. The dir does not
    exist.

v3:
  - Delete duplicate #include "anv_private.h". [per Tapani]
  - Try to fix the Android-IA build in Android.vulkan.mk by following
    Tapani's example.

v4:
  - Unset EXEC_OBJECT_ASYNC and set EXEC_OBJECT_WRITE on the imported
    gralloc buffer, just as we do for all other winsys buffers in
    anv_wsi.c. [found by Tapani]

v5:
  - Really fix the Android-IA build by ensuring that Android.vulkan.mk
    uses Mesa's vulkan.h and not Android's.  Insert -I$(MESA_TOP)/include
    before -Iframeworks/native/vulkan/include. [for Tapani]
  - In vkAcquireImageANDROID, submit signal operations to the
    VkSemaphore and VkFence. [for zhou]

v6:
  - Drop copy-paste duplication in vkGetSwapchainGrallocUsageANDROID().
    [found by zhou]
  - Improve comments in vkGetSwapchainGrallocUsageANDROID().

v7:
  - Fix vkGetSwapchainGrallocUsageANDROID() to inspect its
    VkImageUsageFlags parameter. [for tfiga]
  - This fix regresses dEQP-VK.api.wsi.android.swapchain.*.image_usage
    because dEQP wants to create swapchains with
    VK_IMAGE_USAGE_STORAGE_BIT.

v8:
  - Drop unneeded goto in vkAcquireImageANDROID. [for tfiga]

v8.1: (minor changes)
  - Drop errant hunks added by rerere in anv_device.c.
  - Drop explicit mention of VK_ANDROID_native_buffer in
    anv_entrypoints_gen.py. [for jekstrand]

v9:
  - Isolate as much Android code as possible, moving it from anv_image.c
    to anv_android.c. Connect the files with anv_image_from_gralloc().
    Remove VkNativeBufferANDROID params from all anv_image.c
    funcs. [for krh]
  - Replace some intel_loge() with vk_errorf() in anv_android.c.
  - Use © in copyright line. [for krh]

Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v5)
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> (v9)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v9)
Cc: zhoucm1 <david1.zhou@amd.com>
Cc: Tomasz Figa <tfiga@chromium.org>
2017-10-18 00:23:38 -07:00
Chad Versace
9775894f10 anv: Move size check from anv_bo_cache_import() to caller (v2)
This change prepares for VK_ANDROID_native_buffer. When the user imports
a gralloc handle into a VkImage using VK_ANDROID_native_buffer, the user
provides no size. The driver must infer the size from the internals of
the gralloc buffer.

The patch is essentially a refactor patch, but it does change behavior
in some edge cases, described below. In what follows, the "nominal size"
of the bo refers to anv_bo::size, which may not match the bo's "actual
size" according to the kernel.

Post-patch, the nominal size of the bo returned from
anv_bo_cache_import() is always the size of imported dma-buf according
to lseek(). Pre-patch, the bo's nominal size was difficult to predict.
If the imported dma-buf's gem handle was not resident in the cache, then
the bo's nominal size was align(VkMemoryAllocateInfo::allocationSize,
4096).  If it *was* resident, then the bo's nominal size was whatever
the cache returned. As a consequence, the first cache insert decided the
bo's nominal size, which could be significantly smaller compared to the
dma-buf's actual size, as the nominal size was determined by
VkMemoryAllocateInfo::allocationSize and not lseek().

I believe this patch cleans up that messy behavior. For an imported or
exported VkDeviceMemory, anv_bo::size should now be the true size of the
bo, if I correctly understand the problem (which I possibly don't).

v2:
  - Preserve behavior of aligning size to 4096 before checking. [for
    jekstrand]
  - Check size with < instead of <=, to match behavior of commit c0a4f56
    "anv: bo_cache: allow importing a BO larger than needed". [for
    chadv]
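
A rough sketch of the caller-side check described in v2, assuming the
bo size comes from lseek() on the dma-buf and the requested size is
VkMemoryAllocateInfo::allocationSize. Both helper names are invented
for illustration and are not anv's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round v up to a multiple of a, where a is a power of two. */
uint64_t
align_u64(uint64_t v, uint64_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* The import is rejected only when the bo is smaller than the
 * page-aligned request (a < comparison).  A bo larger than needed is
 * accepted. */
bool
imported_bo_is_large_enough(uint64_t bo_size, uint64_t alloc_size)
{
   return !(bo_size < align_u64(alloc_size, 4096));
}
```

For example, a request of 4097 bytes is aligned to 8192, so an 8192 or
16384 byte dma-buf passes and a 4096 byte one does not.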
2017-10-17 23:46:06 -07:00
Dylan Baker
fbf39fd7c3 meson: turn on pl111 not vc4 when pl111 driver specified
Reviewed-by: Eric Anholt <eric@anholt.net>
fixes: 1918c9b162 ("meson: Add support for the pl111 driver.")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-10-17 15:34:35 -07:00
Bas Nieuwenhuizen
06f05040eb radv: Link shaders.
Here we make use of the NIR linking helpers to remove unused
varyings.

Sascha Willems demo results:

computecullandlod 39 -> 41 fps
pipelines ~6100 -> ~6200 fps

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-10-18 09:19:35 +11:00
Timothy Arceri
dbbf10541b radv: reuse the multiple shader store & load functions for gs copy variant
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 09:19:35 +11:00
Timothy Arceri
351f9dde60 radv: remove some now unused shader compile code
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 09:19:35 +11:00
Timothy Arceri
7d45d22fdd radv: switch to using radv_create_shaders()
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 09:19:35 +11:00
Bas Nieuwenhuizen
d1c9f30d7f radv: add radv_create_shaders() helper
This is a combined shader creation helper that will help us to
create the shaders for each stage at once. This will allow us to
do some link time optimisations.

Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-10-18 09:19:35 +11:00
Bas Nieuwenhuizen
ed9218f154 radv: add radv_hash_shaders() helper
This will be used to create a hash of the combined shaders in the
pipeline.

Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-10-18 09:19:35 +11:00
Bas Nieuwenhuizen
7f29055751 radv: Add multiple shader cache store & load functions.
Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-10-18 09:19:35 +11:00
Bas Nieuwenhuizen
670c02b430 radv: Change cache datastructures for combined pipelines.
Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-10-18 09:19:35 +11:00
Timothy Arceri
56998558f4 radv: reorder init function calls
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 09:19:35 +11:00
Eric Anholt
4f3e380fa0 meson: Add support for the vc5 driver.
v2: Default vc5 to off, since it requires the simulator currently.  Add
    missing dep on the XML generation from libbroadcom_vc5.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v1)
2017-10-17 13:41:59 -07:00
Eric Anholt
1918c9b162 meson: Add support for the pl111 driver.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-17 13:41:59 -07:00
Eric Anholt
1ae8018a6a meson: Add support for the vc4 driver.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-17 13:41:59 -07:00
Marek Olšák
2f4705afde radeonsi: if there's just const buffer 0, set it in place of CONST/SSBO pointer
SI_SGPR_CONST_AND_SHADER_BUFFERS now contains the pointer to const buffer 0
if there is no other buffer there.

Benefits:
- there is no constbuf descriptor upload and shader load

It's assumed that all constant addresses are within bounds. Non-constant
addresses are clamped against the last declared CONST variable.
This only works if the state tracker ensures the bound constant buffer
matches what the shader needs.

Once we get 32-bit pointers, we can only do this for user constant buffers
where the driver is in charge of the upload so that it can guarantee a 32-bit
address.

The real performance benefit might not be measurable.

These apps get 100% theoretical benefit in all shaders (except where noted):
- antichamber
- batman arkham origins
- borderlands 2
- borderlands pre-sequel
- brutal legend
- civilization BE
- CS:GO
- deadcore
- dota 2 -- most shaders
- europa universalis
- grid autosport -- most shaders
- left 4 dead 2
- legend of grimrock
- life is strange
- payday 2
- portal
- rocket league
- serious sam 3 bfe
- talos principle
- team fortress 2
- thea
- unigine heaven
- unigine valley -- also sanctuary and tropics
- wasteland 2
- xcom: enemy unknown & enemy within
- tesseract
- unity (engine)

Changed stats only:
    SGPRS: 2059998 -> 2086238 (1.27 %)
    VGPRS: 1626888 -> 1626904 (0.00 %)
    Spilled SGPRs: 7902 -> 7865 (-0.47 %)
    Code Size: 60924520 -> 60982660 (0.10 %) bytes
    Max Waves: 374539 -> 374526 (-0.00 %)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák
854593b8eb ac: clean up ac_build_indexed_load function interfaces
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák
cdb21dfffa radeonsi: handle 64-bit loads earlier in fetch_constant
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák
ee0e1a47ce radeonsi: add si_descriptors::gpu_address and remove buffer_offset
This allows us to change the pointer arbitrarily.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák
6d2664880c radeonsi: unify code for extracting a buffer address from a descriptor
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák
8d2685d129 radeonsi: remove atom parameter from si_upload_descriptors
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák
4ddce1b1a4 radeonsi: pack si_descriptors better again
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák
859eeffb3d radeonsi: emit dirty consecutive pointers in one SET_SH_REG packet
IB size: -1.6%
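
A minimal sketch of the idea, assuming the dirty pointers are tracked as a
bitmask: each run of consecutive set bits can be emitted as a single packet
instead of one packet per pointer. The function names are hypothetical and
this is not the radeonsi code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Extract the lowest run of consecutive set bits from *mask.
 * Returns its first bit in *start and its length in *count, and clears
 * the run from *mask. *mask must be non-zero. */
void scan_consecutive_range(uint32_t *mask, unsigned *start, unsigned *count)
{
    assert(*mask != 0);

    unsigned s = 0;
    while (!((*mask >> s) & 1u))
        s++;

    unsigned c = 0;
    while (s + c < 32 && ((*mask >> (s + c)) & 1u))
        c++;

    /* Avoid shifting a 32-bit value by 32, which is undefined. */
    uint32_t run = (c == 32) ? UINT32_MAX : (((1u << c) - 1u) << s);
    *mask &= ~run;
    *start = s;
    *count = c;
}

/* Number of packets needed when every consecutive run shares one packet. */
unsigned count_packets(uint32_t dirty_mask)
{
    unsigned packets = 0;
    while (dirty_mask) {
        unsigned start, count;
        scan_consecutive_range(&dirty_mask, &start, &count);
        packets++; /* one packet would cover [start, start + count) */
    }
    return packets;
}
```

For a dirty mask of 0x67 (bits 0-2 and 5-6) this yields two packets rather
than five.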

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák
36626ffe46 radeonsi: split si_emit_shader_pointer
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák
69325fa88d radeonsi: generalize the SI_VS_SHADER_POINTER_MASK macro
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák
79c2e7388c radeonsi/gfx9: use SPI_SHADER_USER_DATA_COMMON
IB size: -0.4%

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák
b762a08896 radeonsi/gfx9: move RW_BUFFERS from s[0:1] to s[8:9] for HS and GS
Let's use the same user data SGPRs in all stages.
(for SPI_SHADER_USER_DATA_COMMON_0)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák
0aafedbbb2 radeonsi: add GFX-IB-size query to the HUD
It shows the sum of all IBs per frame.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák
4d944c72b1 winsys/amdgpu: disable CPU caching for GFX & SDMA IBs
This should decrease IB fetch latency.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Marek Olšák
49f5ce39c1 winsys/amdgpu: don't do read-modify-write on command buffers
i.e. don't use |=

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-17 22:03:03 +02:00
Eric Anholt
cde209960c broadcom/vc4: Fix false-positive for the tiling ioctls on simulator mode.
If there happened to be an ENOENT lying around, we would try using the
ioctls later and fail out resource allocation.
2017-10-17 12:35:16 -07:00
Eric Anholt
b202f90f65 broadcom/vc4: Skip BO labeling when in simulator mode.
It was calling down into i915 trying to label the BO, which is definitely
not the right thing.
2017-10-17 12:35:16 -07:00
Eric Anholt
d623a34ab2 broadcom/vc5: Don't forget to set the RT format for 1555 textures.
Fixes dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgb5_a1
2017-10-17 12:35:16 -07:00
Chad Versace
b5dc551014 anv: Add func anv_gem_get_tiling()
Will use in VK_ANDROID_native_buffer.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-17 11:08:26 -07:00
Chad Versace
eb69a61806 anv: Move close(fd) from anv_bo_cache_import to its callers (v2)
This will allow us to implement VK_ANDROID_native_buffer without dup'ing
the fd. We must close the fd in VK_KHR_external_memory_fd, but we should
not in VK_ANDROID_native_buffer.

v2:
  - Add missing close(fd) for case
    VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD_BIT_KHR, subcase
    ANV_SEMAPHORE_TYPE_BO.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-17 11:08:26 -07:00
Chad Versace
076a279a1a anv: Add field anv_image::planes[]::bo_is_owned (v2)
If this flag is set, then the image and the bo have the same lifetime.
vkDestroyImage will release the bo.

We need this for VK_ANDROID_native_buffer, because that extension
creates the VkImage *and* imports its memory during the same
call, vkCreateImage.

v2: Rebase onto VK_KHR_bind_memory2.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-17 11:08:26 -07:00
Chad Versace
a9ca8f370d anv: Better support for Android logging (v2)
In src/intel/vulkan/*, redirect all instances of printf, vk_error,
anv_loge, anv_debug, anv_finishme, anv_perf_warn, anv_assert, and their
many variants to the new intel_log functions. I believe I caught them
all.

The other subdirs of src/intel are left for a future exercise.

v2:
  - Rebase onto Tapani's VK_EXT_debug_report changes.
  - Drop unused #include <cutils/log.h>.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-17 11:08:26 -07:00
Chad Versace
aa716db0f6 intel: Add simple logging façade for Android (v2)
I'm bringing up Vulkan in the Android container of Chrome OS (ARC++).

On Android, stdio goes to /dev/null. On Android, remote gdb is even more
painful than the usual remote gdb. On Android, nothing works like you
expect and debugging is hell. I need logging.

This patch introduces a small, simple logging API that can easily wrap
Android's API. On non-Android platforms, this logger does nothing fancy.
It follows the time-honored Unix tradition of spewing everything to
stderr with minimal fuss.

My goal here is not perfection. My goal is to make a minimal, clean API,
that people hate merely a little instead of a lot, and that's good
enough to let me bring up Android Vulkan.  And it needs to be fast,
which means it must be small. No one wants their game to miss frames
while aiming a flaming bow into the jaws of an angry robot t-rex, and
thus become t-rex breakfast, because some fool had too much fun designing
a bloated, ideal logging API.

If people like it, perhaps we should quickly promote it to src/util.

The API looks like this:

    #define INTEL_LOG_TAG "intel-vulkan"
    #define DEBUG

    intel_logd("try hard thing with foo=%d", foo);

    n = try_foo(...);
    if (n < 0) {
        intel_loge("%s:%d: foo failed bigtime", __FILE__, __LINE__);
        return VK_ERROR_DEVICE_LOST;
    }

And produces this on non-Android:

    intel-vulkan: debug: try hard thing with foo=93
    intel-vulkan: error: anv_device.c:182: foo failed bigtime
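
For illustration only, the non-Android output format shown above
("<tag>: <level>: <message>") could be produced by something like the
following. format_log_line and log_to_stderr are hypothetical helpers, not
part of the intel_log API, and the real implementation may be structured
differently.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Format "<tag>: <level>: <message>" into buf.
 * Returns the length the full line would have (as snprintf does), or a
 * negative value on a formatting error. */
int format_log_line(char *buf, size_t size, const char *tag,
                    const char *level, const char *fmt, ...)
{
    assert(buf != NULL && size > 0);

    int n = snprintf(buf, size, "%s: %s: ", tag, level);
    if (n < 0 || (size_t)n >= size)
        return n; /* error, or the prefix alone was truncated */

    va_list ap;
    va_start(ap, fmt);
    int m = vsnprintf(buf + n, size - (size_t)n, fmt, ap);
    va_end(ap);

    return m < 0 ? m : n + m;
}

/* Non-Android fallback: write the finished line to stderr. */
void log_to_stderr(const char *line)
{
    fprintf(stderr, "%s\n", line);
}
```

Formatting into a buffer first keeps the sketch easy to redirect to another
sink, such as Android's logger, without changing the callers.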

v2: Fix meson build. [for dcbaker]

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-17 11:08:26 -07:00
Tapani Pälli
3555d36139 anv/android: Link to libsync, liblog in Android.mk
chadv: I made this patch by extracting the hunk from Tapani's patch in
https://lists.freedesktop.org/archives/mesa-dev/2017-September/169602.html.

Signed-off-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-17 11:08:26 -07:00
Chad Versace
3791fe23af anv/android: Link to Android libraries in the autotools build
A first step to supporting Vulkan on ARC++. Mesa on ARC++ uses
Autotools, not Android.mk.

Doing this now, even before VK_ANDROID_native_buffer is implemented,
allows us to incrementally add Android support to the Autotools build.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-17 11:08:26 -07:00
Eric Engestrom
320018be77 meson: s/radv_extensions/radv_extensions_c/ to respect var convention
Suggested-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-17 19:07:09 +01:00
Eric Engestrom
1f0e80f897 meson: track python script dependency
Suggested-by: Andres Gomez <agomez@igalia.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-17 19:07:03 +01:00
Henri Verbeet
3de87f7cd7 vulkan/wsi: Free the event in x11_manage_fifo_queues().
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Henri Verbeet <hverbeet@gmail.com>
Fixes: e73d136a02 ("vulkan/wsi/x11: Implement FIFO mode.")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-17 17:17:15 +01:00
Eric Engestrom
cde7859273 meson: add missing radv_extensions.c generation for libvulkan_radeon
Fixes: 17201a2eb0 "radv: port to using updated anv entrypoint/extension generator."
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-10-17 16:19:21 +01:00
Jason Ekstrand
759ab66db0 anv/apply_pipeline_layout: Use nir_tex_instr_remove_src
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-17 07:36:00 -07:00
Jason Ekstrand
41c75b5354 nir: Add a helper for adding texture instruction sources
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-17 07:36:00 -07:00
Mark Thompson
31fb7bbe0b st/va: Return correct width and height for encode/decode support
Previously this would return the largest possible buffer size, which is
much larger than the codecs themselves support.  This caused confusion
when client applications attempted to decode 8K video thinking it was
supported when it isn't.

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-10-17 08:23:55 -04:00
Mark Thompson
ba28c1c9f7 st/va: Fix config entrypoint handling
Consistently use it as a PIPE_VIDEO_ENTRYPOINT.

v2: Return an error if the entrypoint is not set (Christian).

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-10-17 08:23:55 -04:00
Mark Thompson
b6f41e393e st/va: Disable vaExportSurfaceHandle()
This is not in libva 2.0, so it shouldn't be enabled yet.

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Acked-by: Christian König <christian.koenig@amd.com>
2017-10-17 08:23:55 -04:00
Dave Airlie
35c66f3e40 radv/image: bump all the offset to uint64_t.
One of the CTS tests tries to allocate a 16384x1 texture with 2048
array layers. This overflows a bunch of calculations when we want it
tiled, as the height goes to 128.

addrlib returns us the correct size (16GB or so), but we mangle
it in the htile calcs due to the 32-bit offset fields, then
userspace gives us the reduced number and we try to allocate
it on a heap and things blow up.

We really need to give the app back the correct size for the
image so we can blow up properly in memory allocation later.
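
The overflow can be reproduced with a small sketch. It assumes 4 bytes per
texel and uses the padded height of 128 mentioned above; the function names
are hypothetical and the real size calculation involves addrlib and htile
metadata, which are left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Size computed in 32-bit arithmetic: the product wraps modulo 2^32. */
uint32_t image_size_u32(uint32_t width, uint32_t height,
                        uint32_t layers, uint32_t bytes_per_texel)
{
    return width * height * layers * bytes_per_texel;
}

/* Same product, widened to 64 bits before multiplying. */
uint64_t image_size_u64(uint32_t width, uint32_t height,
                        uint32_t layers, uint32_t bytes_per_texel)
{
    /* Degenerate (zero-sized) images are out of scope for this sketch. */
    assert(width && height && layers && bytes_per_texel);
    return (uint64_t)width * height * layers * bytes_per_texel;
}
```

With width 16384, height 128, 2048 layers and 4 bytes per texel, the 64-bit
result is 2^34 bytes (16 GiB), while the 32-bit result wraps to 0.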

This should fix hangs in
dEQP-VK.pipeline.render_to_image.core.1d_array.huge.width_layers.r8g8b8a8_unorm_d32_sfloat_s8_uint
since
Fixes: ad3d98da9f (radv: enable tc compatible htile for d32s8 also.)

Now there's an open question if we should be enabling tc-compat
htile at all for shallow textures like the above.

This might cause some other weird side effects in CTS even
without the tc compat so:
Cc: "17.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-17 08:28:48 +01:00
Dave Airlie
17201a2eb0 radv: port to using updated anv entrypoint/extension generator.
This ports radv to using the anv entrypoint/extension generator
code.

No differences on enabled extensions list in vulkaninfo.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-17 16:50:32 +10:00
Dave Airlie
c00256a12c radv: enable VK_KHX_multiview always.
This was in the wrong place.

Fixes: ba51ad2f2 (radv: Expose VK_KHX_multiview.)
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-17 16:48:14 +10:00
Marek Olšák
5d071bf04b Revert "mesa: fix texture updates for ATI_fragment_shader"
This reverts commit 9d54025cd1.

It breaks KOTOR.

Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
2017-10-17 04:16:17 +02:00
Miklós Máté
1b86dbc144 mesa: remove redundant NULL check in update_single_program_texture_state
update_single_program_texture() never returns NULL.

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-17 04:16:17 +02:00
Dylan Baker
43a6e84927 meson: build mesa test.
v2: - add dependency on dispatch.h generator (which this test needs)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
2017-10-16 16:39:26 -07:00
Dylan Baker
c7081a3b08 .travis: Don't build gallium drivers in non-gallium test targets
Simply disable gallium in non-gallium builds. For some reason the
gallium driver won't link on ubuntu 14.04 (it will on 16.04, debian
testing, and arch)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-10-16 16:32:43 -07:00
Dylan Baker
61631be3a9 meson: refactor meson_options
To put one argument on each line. This results in the file being much
longer, but I think much more readable.

Suggested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker
6a9ad20b7c meson: build llvmpipe
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker
de24d61765 meson: build softpipe
This doesn't include llvmpipe.

v2: - Fix inconsistent use of with_gallium_swrast and
      with_gallium_softpipe.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker
813b4b09f9 meson: build nouveau (gallium) driver
Tested with a GK107.

v2: - Add target for nouveau standalone compiler. This target is not
      built by default.
v3: - Add nouveau to list of drivers built by default

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker
b154b44ae3 meson: build radeonsi gallium driver
This hooks up the bits necessary to build gallium dri drivers, with
radeonSI as the first example driver. This isn't tested yet.

v4: - drop radeonsi generated header from sources.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker
66c94b9313 meson: build gallium winsys for dri, null, and wrapper
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker
66f97f6640 meson: build radeonsi
This builds the radeonsi (and radeon) window system bits and gallium
driver bits.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker
f3d03a2cf7 meson: Build gallium dri state tracker
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker
4d701ee969 meson: build gallium helper drivers
This builds ddebug, noop, rbug, and trace drivers.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker
d451a11b21 meson: Build gallium pipe-loader
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-10-16 16:32:43 -07:00
Dylan Baker
50c28dfa81 meson: split and simplify dependencies
Rather than group dependencies in complex groups, use a flatter
structure with split dependencies to avoid checking for the same
dependencies twice.

v2: - Fix building vulkan drivers without gallium or dri drivers
v3: - Drop TODO comment that is done
    - Fix typo in commit message

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker
b1b65397d0 meson: Build gallium auxiliary
v2: - guard gallivm files with "with_llvm" instead of "dep_llvm.found()"

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
2017-10-16 16:32:43 -07:00
Dylan Baker
af9d276134 meson: build libmesa_gallium
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker
02cf3a8f39 meson: Add option to toggle LLVM
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker
8e611878c4 meson: always set GLX_USE_TLS
This can be applied to all GLX implementations, and in autotools this is
guarded only by the --enable-glx-tls flag. Since this is on by default
in autotools, and is strictly better than being off, the meson build
doesn't even have a toggle for it.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker
90b5ec6c5f meson: Don't try to install dri drivers unless one is built
This confused the with_dri flag which is meant to control Direct
Rendering Infrastructure, not classic drivers

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker
601bd7296f meson: Set _GNU_SOURCE
When we start adding non-free software platforms support we'll need to
guard this, but for now it should be fine as is.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker
e21e0a6a70 meson: add checks for version script and dynamic list
These are used by gallium drivers.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Dylan Baker
e4796ab7c8 configure: commit test files
These are currently auto-generated, but meson needs the same files, so
let's commit them to reduce duplication.

v3: - Rename .build to build-support

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-10-16 16:32:43 -07:00
Dylan Baker
3b209e9304 meson: Add switch for texture float
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 16:32:43 -07:00
Kenneth Graunke
9e779e59b2 Revert "i965/tex_image: Reference the renderbuffer miptree in setTexBuffer2"
This reverts commit d80cbbeaff.

It turns out that formats do matter - the framebuffer's miptree has an
sRGB format, and the one we created did not.  This broke rendering when
using KWin compositing, GNOME Terminal on Fedora (with a transparent
background), and Qt menu rendering in general, to name a few.

It's been a month and this hasn't been fixed, and I'm sick of reverting
this patch or applying NAK'd hacks and restarting various programs at
random times every day, multiple times a day, to keep my desktop
environment functional.

The only benefit of this patch was to prepare the way for modifiers,
which AFAIK aren't finished yet anyway, so there's really no downside
to reverting it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102924
2017-10-16 16:02:53 -07:00
Rob Herring
c6e584f194 Android: add libmesa_nir dependency to libmesa_dricore
Commit 32fcced7b4 ("meta: Unset the textures_used_by_txf bitfield.")
added a dependency in libmesa_dricore to NIR headers, but failed to add
libmesa_nir as a dependency resulting in a build error:

In file included from external/mesa3d/src/mesa/drivers/common/meta.c:90:
external/mesa3d/src/compiler/nir/nir.h:48:10: fatal error: 'nir_opcodes.h' file not found

Add libmesa_nir as a static library dependency to libmesa_dricore.

Fixes: 32fcced7b4 ("meta: Unset the textures_used_by_txf bitfield.")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-10-16 14:49:37 -05:00
Chris Wilson
2c4097aff1 i965: Only put external handles into the handle ht
We know that we will only ever need to lookup an external handle and so
can defer adding a bo to the external ht until it is ever exported or
imported, keeping that hashtable compact.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-16 11:52:59 -07:00
Eric Engestrom
b05820621d svga: format the version string like the rest of mesa
All 4 other version strings do it like this.
((Also, double parentheses just look confusing))

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-10-16 18:52:41 +01:00
Eric Engestrom
16be271c6e git_sha1_gen: use git_sha1.h.in on all build systems
Meson already uses this, let's get the other build sys to use it too.

Note: rstrip() was dropped, as truncating to the first 10 chars already
gets rid of the terminating newline (not an issue with the env var
either, unless maliciously crafted to break the build... not sure this
is a real-world issue).

Verified to work and give the same output as before on both python 2
and 3 :)

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-10-16 18:52:35 +01:00
Brian Paul
4542a63254 svga: fix format_conversion_table breakage
The new A1B5G5R5_UNORM, X1B5G5R5_UNORM formats were added in the
wrong place in commit ef874ee450.

Fixes: ef874ee450 "gallium: Add support for 5551 with the 1-bit field in the low bit."

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-16 10:58:02 -06:00
Jason Ekstrand
92d3f21ec2 i965/miptree: Drop the invalidate parameter from copy_teximage
This was a leftover from i915.  The one caller in i965 always passes in
false so there's no point in having the parameter.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-16 08:06:02 -07:00
Jason Ekstrand
b03b19f558 anv: Get rid of gen fall-through
In the early days of the Vulkan driver, we thought it would be a good
idea to just make genN just fall back to the genN-1 code if it didn't
need to be any different for genN.  While this seemed like a good idea,
it ultimately ended up being far simpler to just recompile everything.
We haven't been using the fall-through functionality for some time so
we're better off just deleting it so it doesn't accidentally start
causing problems.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-16 08:04:56 -07:00
Jason Ekstrand
9cec35579c intel/common: Improve the comments for sample positions
These are pulled directly from brw_multisample_state.h

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-16 08:04:56 -07:00
Samuel Pitoiset
f16382d35b radv: update ia_multi_vgt when executing secondary buffers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-16 14:50:30 +02:00
Samuel Pitoiset
47d7d18613 radv: be smarter with the draw packets when executing secondary buffers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-16 14:50:27 +02:00
Samuel Pitoiset
b253f3189a radv: always dirty some states after executing secondary buffers
The spec requires the number of buffers to be greater than 0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-16 14:09:51 +02:00
Samuel Pitoiset
4e65b4ea4b radv: be smarter with pipelines when emitting secondary buffers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-16 14:09:51 +02:00
Jakob Bornecrantz
67dd52e7e8 docs: Add EXT_memory_objects extensions to features.txt
These extensions are good for Vulkan interop, so track them.

Signed-off-by: Jakob Bornecrantz <jakob.bornecrantz@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-16 11:05:41 +01:00
Timothy Arceri
f1eb5e6399 nir: add component level support to remove_unused_io_vars()
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 09:06:53 +11:00
Timothy Arceri
9f7127f5d2 glsl: mark xfb inputs as always_active_io
We won't split varyings marked as always active because there
is no point in doing so. This means we need to mark both
sides of the interface as always active otherwise we will have
a mismatch and start removing things we shouldn't.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 09:06:53 +11:00
Timothy Arceri
6af5e0bec9 nir: add variant of lower_io_to_scalar to be called earlier
This is intended to be called before nir_lower_io() so that we
can do some linking optimisations with the results. It can also
be used with drivers that don't use nir_lower_io() at all such
as RADV.

v2: pass mode mask rather than first and last stage integer.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 09:06:53 +11:00
Timothy Arceri
3b59f5ca17 nir: add glsl_channel_type() helper
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 09:06:53 +11:00
Timothy Arceri
421c1b9bd6 nir: add glsl_type_is_64bit() to nir_types
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-16 09:06:53 +11:00
Ilia Mirkin
790b5c4a38 a2xx: add support for a few 16-bit color rendering formats
The rest should be possible too, just needs some additional
investigation. Passes fbo-*-formats piglit tests.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-15 12:09:21 -04:00
Wladimir J. van der Laan
d3af7f5153 freedreno/a20x: Enable rendering to RGBA/RGBX
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-15 12:09:14 -04:00
Wladimir J. van der Laan
c10eeb454d freedreno/a20x: Fix rendering to BGRX
Make sure that BGRX rendering is swapped the correct way around.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-10-15 12:09:03 -04:00
Brian Paul
c7a81dcea9 mesa: minor simplification in test_attachment_completeness()
We already have a pointer to the texture object.  Use it here.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-14 10:30:27 -06:00
Lucas Stach
4daee6733f etnaviv: rework TS enable to be a derived state
Draw operations should not use the TS if the TS buffer content is invalid,
as this leads to wrong rendering or even GPU hangs. As the TS valid status
can change between draws (clear operations changing it to valid, blits using
the RS to the color or ZS buffer changing it to invalid), the TS_MEM_CONFIG
must be updated before each draw if the status has changed.

This fixes the remaining TS related piglit failures (regressions of a
standard run against a piglit run with TS completely disabled).

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-10-14 16:40:08 +02:00
Lucas Stach
34360ac6ed etnaviv: skip unused vertex attributes when assigning VS inputs
When not all of the vertex attributes are actually used in the shader,
we end up with some inputs without an assigned reg. Those are marked
as invalid and must be skipped when assigning the inputs, as those would
overwrite other valid inputs otherwise.

Fixes piglit drawpixels and a bunch of other tests using the st_draw path.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-10-14 16:39:46 +02:00
Samuel Pitoiset
0c1aecf177 radv: do not allocate CMASK for non-MSAA images with 128 bit formats
This saves some useless CMASK initializations/eliminations in
the Vulkan SSAO demo.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-14 12:25:48 +02:00
Samuel Pitoiset
a4c08c8cd5 radv: set correct INDEX_TYPE for indexed indirect draws on GFX9
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-14 12:05:19 +02:00
Samuel Pitoiset
3e5f27faf3 radv: add the draw count buffer to the list of buffers
My guess is that the GPU is going to report VM faults if
vkCmdDrawIndirectCountAMD() (and friends) are used.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-14 12:05:19 +02:00
Jason Ekstrand
1cec500c69 blob: Use intptr_t instead of ssize_t
ssize_t is a GNU extension and is not available on Windows or MacOS.
Instead, we use intptr_t which should be effectively equivalent and is
part of the C standard.  This should fix the Windows and Mac OS builds.

Fixes: 3af1c82989
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103253
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2017-10-13 15:02:34 -07:00
Kenneth Graunke
77d3d71f23 i965: Rename brw->no_batch_wrap to intel_batchbuffer::no_wrap
This really makes more sense in the intel_batchbuffer struct.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-10-13 11:16:41 -07:00
Kenneth Graunke
d22bc4ba52 i965: Delete dead brw_context fields.
fast_clear_op is leftover from the meta-fast-clear days.
No idea what the other thing was for, but it isn't used now.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-10-13 11:16:41 -07:00
Emil Velikov
9b753e8ca3 mapi/shared-glapi/test: rework glapitable.h handling
Currently all the build systems but Meson generate the header in
src/mapi/glapi. Meson cannot do that since:
 - it does not allow user control over the location of output files
 - moving the generation rule(s) causes explosion due to the unusual
structure of glapi and friends
 - copying the file into the correct location is a non-trivial task

To work around the above deficiency in the least invasive way, let's
adjust the #include directive and add a few -I flags to the autotools
build.

Note: both builddir and srcdir should be used. Otherwise building from
a release tarball fails badly.

Cc: Dylan Baker <dylanx.c.baker@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-13 11:12:08 -07:00
Dylan Baker
142dc8b9de meson: fix blob test includes
Since blob.h moved up to src/compiler the test should include that
instead of src/compiler/glsl

Fixes: 0e3bd56c6e ("compiler: Move blob up a level")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-13 10:40:23 -07:00
Emil Velikov
ee779c93d5 Revert "make: Fix test to be meson compatible"
This reverts commit fc48ad2427.

The commit references the previous commit as its justification for
changing behaviour. Unlike said commit, though, there's nothing
obviously wrong there.

I'll take a closer look at why Meson fails to pick up the file, but in the
interim reverting this commit fixes the normal distcheck target.
2017-10-13 14:57:33 +01:00
Mark Thompson
e7f24859ca st/dri: Add definitions to allow importing 16-bit surfaces
Necessary to support P010/P016 surfaces for video.

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Acked-by: Leo Liu <leo.liu@amd.com>
2017-10-13 08:11:47 -04:00
Mario Kleiner
556037f131 i965: Complete 'expose RGBA visuals only on Android'
Commit 731ba6924a
"expose RGBA visuals only on Android" replaced
ARRAY_SIZE(formats) by num_formats, but there are
3 loops which add configs, and only one was updated
to num_formats.

Also update loops for configs with accumulation buffer
and multisample configs.

Fixes: 731ba6924a "i965: expose RGBA visuals only on Android"
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-13 12:56:13 +01:00
Emil Velikov
df3a430180 configure.ac: add missing LLVM components for OpenCL
Coverage and LTO seem to be hard requirements for Clang, while
coroutines is needed as of LLVM/Clang 4.0.

Mark the last one as "optional" so we handle every case.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
2017-10-13 12:56:13 +01:00
Emil Velikov
36d6d1e931 configure.ac: add llvm_add_optional_component helper
We want to add "optional" components, which have been added with later
LLVM versions.

One such in-tree example is inteljitevents. Others are to follow
shortly.

v2: Use the correct function, add blank line between functions (Tobias)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
2017-10-13 12:56:13 +01:00
Emil Velikov
a7ecf7b86f Travis: add binutils 2.26 for a few more LLVM 3.9 builds
Otherwise we error out at link stage as follows:

/usr/lib/llvm-3.9/lib/libLLVMAMDGPUCodeGen.a(R600OptimizeVectorRegisters.cpp.o):
unrecognized relocation (0x2a) in section
`.text._ZNK12_GLOBAL__N_119R600VectorRegMerger16getAnalysisUsageERN4llvm13AnalysisUsageE'
/usr/bin/ld: final link failed: Bad value

Cc: mesa-stable@lists.freedesktop.org
Cc: Jan Vesely <jan.vesely@rutgers.edu>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-13 12:56:13 +01:00
Emil Velikov
13a53c4f5c configure.ac: rework llvm libs handling for 3.9+
Earlier versions need different quirks, but as of LLVM 3.9 llvm-config
provides --link-shared/--link-static toggles.

The output of which seems to be reliable - looking at LLVM 3.9, 4.0 and
5.0.

Note that the earlier code is still used for pre-3.9 LLVM and is
unchanged.

This effectively fixes LLVM static linking, while providing a clearer
and more robust solution for future versions.

Mildly interesting side notes:

 - build-mode (introduced with 3.8) was buggy with 3.8
It shows "static" when built with -DLLVM_LINK_LLVM_DYLIB=ON, yet it was
consistent with --libs. The latter shows the static libraries.

 - libnames and libfiles are broken with LLVM 3.9
The library prefix and extension are printed twice: liblibLLVM-3.9.so.so

v2: Invoke llvm-config twice, instead of using sed, to combine the two
lines into one (Tobias)

Cc: mesa-stable@lists.freedesktop.org
Cc: Dieter Nützel <Dieter@nuetzel-hh.de>
Cc: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
2017-10-13 12:56:12 +01:00
Emil Velikov
98fdff7247 configure.ac: factor out detection for old and buggy llvm
As of LLVM 3.9 one could use consistent ways to handle the component.
Factor out the current handling, as it will be used for older versions.

Cc: mesa-stable@lists.freedesktop.org
Cc: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
2017-10-13 12:56:12 +01:00
Emil Velikov
9032e2cdcc configure.ac: remove no longer necessary llvm-config --libs check
Prior to the refactor/cleanup by Tobias one could add an invalid
component to LLVM_COMPONENTS.

Since that's no longer the case we can drop the current check.

Cc: Tobias Droste <tdroste@gmx.de>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
2017-10-13 12:56:12 +01:00
Emil Velikov
66ebdfbd44 eglmesaext: add forward declaration for struct wl_buffer
The user does not need to know the specifics of the struct, as only a
pointer to it is used.

Just forward declare the struct making the header self-contained.

v2: Remove deprecation warning text/bugzilla - patch does not help there.

Cc: Greg V <greg@unrelenting.technology>
Fixes: 5cddb1ce3c ("wayland: Add an extension to create wl_buffers from
EGLImages")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
2017-10-13 12:56:12 +01:00
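Sketch for 66ebdfbd44 (forward declaration): a minimal illustration of why a header that only passes pointers does not need the full struct definition. The typedef and helper below are hypothetical and are not the eglmesaext.h contents.

```c
/* Forward declaration: the type is named but stays incomplete, which is
 * all a header needs when it only ever handles pointers to it. */
struct wl_buffer;

/* Hypothetical prototype standing in for an entry point that takes the
 * buffer by pointer. */
typedef int (*example_query_buffer_fn)(struct wl_buffer *buffer,
                                       int attribute, int *value);

/* Pointers to an incomplete type can be stored, passed and compared;
 * only dereferencing would require the full definition. */
static int
example_buffer_is_null(const struct wl_buffer *buffer)
{
   return buffer == 0;
}
```

A translation unit can include such a header without pulling in any Wayland headers, which is what makes it self-contained.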
Emil Velikov
a14ecdab16 configure.ac: bump Clover LLVM requirement to 3.9
The only driver that utilises Clover already depends on LLVM 3.9.
Close to every supported distribution has said version.

Additionally libclc also requires LLVM 3.9.

With this in mind, we can safely bump the requirement.

There is a handful of dead code that we could remove, which will be
resolved with later commits.

Note: this drops the LLVM 3.6 build from the Travis build. LLVM 3.9 (and
later) are already covered in there.

https://lists.freedesktop.org/archives/mesa-dev/2017-September/170028.html

v2: Add reference to discussion thread (Eric), adjust libclc LLVM req.
(Jan).

Cc: Aaron Watry <awatry@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Jan Vesely <jan.vesely@rutgers.edu>
Acked-by: Francisco Jerez <currojerez@riseup.net>
2017-10-13 12:56:12 +01:00
Emil Velikov
acb84ffbc7 wayland-drm: constify the callbacks struct, take 2
Now that wayland-drm (correctly) keeps a local copy of the callbacks,
this should no longer cause explosions.

After all the symbol is a local, constant data.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Tested-by: Derek Foreman <derekf@osg.samsung.com>
2017-10-13 12:56:12 +01:00
Emil Velikov
0cfd6f6cfc wayland-drm: use a copy of the wayland_drm_callbacks struct
The callbacks may be called even when they are no longer valid.
Say, the user is dlclose(ing) libEGL while the buffers are being
destroyed.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Tested-by: Derek Foreman <derekf@osg.samsung.com>
2017-10-13 12:56:12 +01:00
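Sketch for 0cfd6f6cfc (keeping a copy of the callbacks): a minimal illustration, assuming the problem is that the caller's struct may go away while the consumer still holds a pointer to it. All names are hypothetical; this is not the wayland-drm code.

```c
#include <stddef.h>

struct example_callbacks {
   int (*authenticate)(void *user_data, unsigned id);
   void (*release_buffer)(void *user_data, void *buffer);
};

struct example_drm {
   /* Stored by value: remains readable after the caller's copy is gone. */
   struct example_callbacks callbacks;
   void *user_data;
};

static void
example_drm_init(struct example_drm *drm,
                 const struct example_callbacks *callbacks, void *user_data)
{
   drm->callbacks = *callbacks; /* copy the struct, do not keep the pointer */
   drm->user_data = user_data;
}

/* Trivial callback used only to exercise the sketch. */
static int
example_always_ok(void *user_data, unsigned id)
{
   (void)user_data;
   (void)id;
   return 1;
}
```

Because the struct is copied, the caller can later clear or free its own callbacks table without affecting what example_drm calls.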
Emil Velikov
872a373bc8 egl/dri: don't crash when createImageFromRenderbuffer2 is NULL
The __DRI_IMAGE version can be 17 or over, while the function pointer is
NULL. Guard for that instead of crashing.

Fixes: bad24395d9 ("egl/dri: use createImageFromRenderbuffer2 when
available")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-13 12:56:12 +01:00
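Sketch for 872a373bc8 (guarding the function pointer): a minimal illustration of checking the pointer in addition to the interface version. The struct layout and names are hypothetical, and the fallback to an older entry point is an assumption made for the example, not something the commit message states.

```c
#include <stddef.h>

struct example_image_ext {
   int version;
   void *(*create_from_renderbuffer2)(void *ctx, int rb, unsigned *error);
   void *(*create_from_renderbuffer)(void *ctx, int rb);
};

static void *
example_create_image(const struct example_image_ext *ext, void *ctx, int rb,
                     unsigned *error)
{
   /* A high enough version does not guarantee the hook is implemented,
    * so test the pointer as well before calling through it. */
   if (ext->version >= 17 && ext->create_from_renderbuffer2 != NULL)
      return ext->create_from_renderbuffer2(ctx, rb, error);

   /* Assumed fallback to the older entry point, if present. */
   if (ext->create_from_renderbuffer != NULL)
      return ext->create_from_renderbuffer(ctx, rb);

   return NULL;
}
```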
Ville Syrjälä
2289964f4f meson: Build i915
Build i915 with meson. More or less copied from i965, with all
the unneeded cruft removed, and the libdrm_intel dependency added.

Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-13 14:29:00 +03:00
Ville Syrjälä
66b1597a88 meson: Fix xf86vm dep
The pkg-config file is called xxf86vm.pc not xf86vm.pc.

Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-13 14:28:41 +03:00
Jason Ekstrand
79d403417c intel/cs: Make thread_local_id a regular builtin param
This is a lot more natural than special casing it all over the place.
We still have to do a bit of special-casing in assign_constant_locations
but it's not special-cased quite as bad as it was before.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:31 -07:00
Jason Ekstrand
8d90e28839 intel/compiler: Allocate pull_param in assign_constant_locations
Now that everything is nicely ralloc'd, we can allocate the pull_param
array in assign_constant_locations instead of higher up.  We can also
re-allocate the param array so that it's exactly the needed size.  This
should save us some memory because we're not allocating the total needed
param space for both push and pull.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:31 -07:00
Jason Ekstrand
29737eac98 intel: Allocate prog_data::[pull_]param deeper inside the compiler
Now that we're always growing the param array as-needed, we can
allocate the param array in common code and stop repeating the
allocation everywhere.  In order to keep things sane, we ralloc the
[pull_]param array off of the compile context and then steal it back
to a NULL context later.  This doesn't get us all the way to where
prog_data::[pull_]param is purely an out parameter of the back-end
compiler but it gets us a lot closer.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:31 -07:00
Jason Ekstrand
c3d54d0375 ralloc: Allow reparenting to a NULL context
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:31 -07:00
Jason Ekstrand
2e317a4b6d anv/pipeline: Refactor setup of the prog_data::param array
Now that the only thing we put in the array up-front are client push
constants, we can simplify anv_pipeline_compile a bit.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:31 -07:00
Jason Ekstrand
6b31229592 anv/pipeline: Grow the param array for images
Before, we were calculating up-front and then filling in later.  Now we
just grow as needed in anv_nir_apply_pipeline_layout.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:31 -07:00
Jason Ekstrand
63c938fd18 anv/pipeline: Whack nir->num_uniforms to MAX_PUSH_CONSTANT_SIZE
This way any image uniforms end up having locations higher than
MAX_PUSH_CONSTANT_SIZE.  There's no bug here at the moment, but this
consistency will make the next commit easier.  Also, because
nir_apply_pipeline_layout properly increments nir->num_uniforms when
it expands the param array, we no longer need to stomp it to match
prog_data::nr_params because it already does.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:31 -07:00
Jason Ekstrand
4dfb8b3416 intel/vs: Grow the param array for clip planes
Instead of requiring the caller of brw_compile_vs to figure it out, just
grow the param array on-demand.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:30 -07:00
Jason Ekstrand
6bcc5c0c75 intel/cs: Grow prog_data::param on-demand for thread_local_id_index
Instead of making the caller of brw_compile_cs add something to the
param array for thread_local_id_index, just add it on-demand in
brw_nir_intrinsics and grow the array.  This is now safe to do because
everyone is now using ralloc for prog_data::param.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:30 -07:00
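Sketch for 6bcc5c0c75 (growing the param array on demand): a minimal illustration that uses plain realloc() as a stand-in for ralloc. The struct and function names are hypothetical and do not reproduce the Mesa helper.

```c
#include <stdint.h>
#include <stdlib.h>

struct example_prog_data {
   uint32_t *param;
   unsigned nr_params;
};

/* Grows the param array by nr_new entries and returns a pointer to the
 * first new slot, or NULL if nothing was added. */
static uint32_t *
example_add_params(struct example_prog_data *pd, unsigned nr_new)
{
   if (nr_new == 0)
      return NULL;

   unsigned old = pd->nr_params;
   uint32_t *grown = realloc(pd->param,
                             ((size_t)old + nr_new) * sizeof(uint32_t));
   if (grown == NULL)
      return NULL; /* the old array is still valid and unchanged */

   pd->param = grown;
   pd->nr_params = old + nr_new;
   return grown + old;
}
```

A pass that discovers it needs one more param can call the helper at that point instead of requiring the caller to reserve the slot up-front.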
Jason Ekstrand
b1d1b7222a intel/compiler: Make brw_nir_lower_intrinsics compute-specific
It's already only ever called from brw_compile_cs and only handles
compute intrinsics.  Let's just make it CS-specific.  We can always
make it handle other stages again later if we want.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:30 -07:00
Jason Ekstrand
2db9470d88 intel/compiler: Add a helper for growing the prog_data::param array
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:30 -07:00
Jason Ekstrand
c0435b204a intel/compiler: Stop adding params for texture sizes
We haven't needed this ever since we started using NIR for lowering
rectangle textures.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:30 -07:00
Jason Ekstrand
4d4f149376 i965: Only add the wpos state reference if we lowered something
Otherwise, in the ARB program case _mesa_add_state_reference may grow
the parameter array which will cause brw_nir_setup_arb_uniforms to write
past the end of the param array because it only looks at the parameter
list length but the param array is allocated based on nir->num_uniforms.
The only reason this hasn't caused us problems is because we are padding
out the param array for fragment programs unnecessarily.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:30 -07:00
Jason Ekstrand
4efd079aba intel/compiler: Add a flag for pull constant support
The Vulkan driver does not support pull constants.  It simply limits
things such that we can always push everything.  Previously, we were
determining whether or not to push things based on whether or not the
prog_data::pull_param array is non-null.  This is rather hackish and
about to stop working.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:30 -07:00
Jason Ekstrand
9df64b5666 anv/pipeline: Ralloc prog_data::param off of the compile mem_ctx
This way we stop leaking it.  This is completely safe because, when we
hand it off to anv_shader_bin_create or anv_pipeline_cache_upload_kernel,
they make a copy of the entire param array.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:30 -07:00
Jason Ekstrand
490d80fd1a anv/pipeline: Add a mem_ctx parameter to anv_pipeline_compile
This lets us avoid some of the manual ralloc stealing and prepares for
future commits in which we will want to ralloc prog_data::param.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:30 -07:00
Jason Ekstrand
cfc7ed75eb i965: Store image_param in brw_context instead of prog_data
This burns an extra 10k of memory or so in the case where you don't have
any images.  However, if you have several shaders which use images, this
should be much less memory.  It also gets rid of a part of prog_data
that really has nothing to do with the compiler.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:30 -07:00
Jason Ekstrand
6ee4b352c9 i965: Use prog->info.num_images for needs_dc computation
This should be just as good as looking in prog_data but removes our one
state setup dependency on brw_stage_prog_data::nr_image_param.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:29 -07:00
Jason Ekstrand
2975e4c56a intel: Rewrite the world of push/pull params
This moves us away from the array of pointers model and onto a model where
each param is represented by a generic uint32_t handle.  We reserve 2^16
of these handles for builtins that get generated by somewhere inside the
compiler and have well-defined meanings.  Generic params have handles
whose meanings are defined by the driver.

The primary downside to this new approach is that it moves a little bit
of the work that we would normally do at compile time to draw time.  On
my laptop this hurts OglBatch6 by no more than 1% and doesn't seem to
have any measurable effect on OglBatch7.  So, while this may come back
to bite us, it doesn't look too bad.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:29 -07:00
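Sketch for 2975e4c56a (params as uint32_t handles): a minimal illustration of reserving 2^16 handle values for builtins and resolving handles at draw time. The encoding (top 16 bits all set), the enum and the function names are assumptions made for the example, not the definitions in the Intel compiler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: the highest 2^16 handle values are builtins. */
#define EXAMPLE_PARAM_BUILTIN_BASE 0xffff0000u

enum example_builtin {
   EXAMPLE_BUILTIN_ZERO,
   EXAMPLE_BUILTIN_THREAD_LOCAL_ID,
};

static inline uint32_t
example_builtin_handle(enum example_builtin b)
{
   return EXAMPLE_PARAM_BUILTIN_BASE | (uint32_t)b;
}

static inline bool
example_param_is_builtin(uint32_t param)
{
   return (param & 0xffff0000u) == EXAMPLE_PARAM_BUILTIN_BASE;
}

/* Draw-time resolution: builtins have fixed meanings, everything else
 * is an index whose meaning the driver defines. */
static uint32_t
example_resolve_param(uint32_t param, const uint32_t *driver_values,
                      uint32_t num_driver_values, uint32_t thread_local_id)
{
   if (example_param_is_builtin(param)) {
      switch (param & 0xffffu) {
      case EXAMPLE_BUILTIN_THREAD_LOCAL_ID:
         return thread_local_id;
      default:
         return 0; /* EXAMPLE_BUILTIN_ZERO and unknown builtins */
      }
   }

   return param < num_driver_values ? driver_values[param] : 0;
}
```

The lookup in example_resolve_param is the kind of work that moves from compile time to draw time under this model.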
Jason Ekstrand
faad828b16 i965: Get rid of gen7_cs_state.c
The only thing it was handling was push constants.  We pull the actual
constant upload code into gen6_constant_state.c and the atoms into
genX_state_upload.c.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:29 -07:00
Jason Ekstrand
9b3f917f9e i965: Add a helper for populating constant buffers
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:29 -07:00
Jason Ekstrand
d640627159 i965: Move brw_upload_pull_constants to gen6_constant_state.c
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:39:29 -07:00
Jason Ekstrand
3442c9fc3e nir: Get rid of the variable on vote intrinsics
This looks like a copy+paste error.  They don't actually write into that
variable as would be implied by putting the return there.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-12 22:39:29 -07:00
Jason Ekstrand
a0947921eb nir/opcodes: Fix constant-folding of ufind_msb
We didn't fold correctly in the case of 0x1 because we never let the
loop counter hit 0.  Switching it to bit >= 0 solves this problem.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-12 22:39:29 -07:00
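Sketch for a0947921eb (ufind_msb folding): a minimal C rendering of the loop described above, under the assumption that the result is -1 when no bit is set. The function name is hypothetical; the real folding code is generated from nir_opcodes.py.

```c
#include <stdint.h>

/* Returns the index of the most significant set bit, or -1 for 0. */
static int32_t
example_ufind_msb(uint32_t src)
{
   int32_t dst = -1;

   /* The condition must be bit >= 0: with bit > 0 the loop never
    * examines bit 0, so an input of 0x1 would wrongly yield -1. */
   for (int bit = 31; bit >= 0; bit--) {
      if ((src >> bit) & 1u) {
         dst = bit;
         break;
      }
   }

   return dst;
}
```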
Jason Ekstrand
ac3b73ac8d meta: Delete the PBO texsubimage path for real
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 22:38:40 -07:00
Jason Ekstrand
b8ab78d1af anv/pipeline_cache: Rework to use multialloc and blob
This gets rid of all of our hand-rolled size calculation and
serialization code and replaces it with safe "standards" that are used
elsewhere in anv and mesa.  This should be significantly safer than
rolling our own.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
Jason Ekstrand
2d29dd9ee4 anv/pipeline: Declare bind maps closer to their use
This is just a trivial cleanup.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
Jason Ekstrand
ba4b7e9c44 anv/multialloc: Add new add_size helper
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
Jason Ekstrand
6a41a52e62 compiler/blob: Make some parameters void instead of uint8_t
There are certain advantages to using uint8_t internally such as
well-defined arithmetic on all platforms.  However, interfaces that
work in terms of raw data should use a void* type.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
Jason Ekstrand
4d56ff0a71 compiler/blob: Constify the reader
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
Jason Ekstrand
3af1c82989 compiler/blob: Add (reserve|overwrite)_(uint32|intptr) helpers
These helpers not only call blob_reserve_bytes but also make sure that
the blob is properly aligned as if blob_write_* were called.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
Connor Abbott
6935440967 compiler/blob: make blob_reserve_bytes() more useful
Despite the name, it could only be used if you immediately wrote to the
pointer. Nobody was using it outside of one test, so clearly this
behavior wasn't that useful. Instead, make it return an offset into the
data buffer so that the result isn't invalidated if you later write to
the blob. In conjunction with blob_overwrite_bytes(), this will be
useful for leaving a placeholder and then filling it in later, which
we'll need to do for handling phi nodes when serializing NIR.

v2 (Jason Ekstrand):
 - Detect overflow in the offset + to_write computation

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
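Sketch for 6935440967 (reserve, then overwrite): a minimal illustration of why returning an offset is more robust than returning a pointer, since a later write may reallocate the buffer. The struct and function names are hypothetical, and unlike the Mesa blob this sketch does no alignment.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct example_blob {
   uint8_t *data;
   size_t allocated;
   size_t size;
   bool out_of_memory;
};

static bool
example_grow_to_fit(struct example_blob *blob, size_t additional)
{
   if (blob->out_of_memory)
      return false;
   if (additional > SIZE_MAX - blob->size) {
      blob->out_of_memory = true;
      return false;
   }

   size_t needed = blob->size + additional;
   if (needed <= blob->allocated)
      return true;

   size_t to_alloc = blob->allocated ? blob->allocated * 2 : 64;
   if (to_alloc < needed)
      to_alloc = needed;

   uint8_t *new_data = realloc(blob->data, to_alloc);
   if (new_data == NULL) {
      blob->out_of_memory = true;
      return false;
   }
   blob->data = new_data;
   blob->allocated = to_alloc;
   return true;
}

/* Returns the offset of the reserved span, or -1 on failure.  The offset
 * stays meaningful even if the buffer moves on a later write. */
static intptr_t
example_reserve_bytes(struct example_blob *blob, size_t to_write)
{
   if (!example_grow_to_fit(blob, to_write))
      return -1;

   intptr_t offset = (intptr_t)blob->size;
   blob->size += to_write;
   return offset;
}

static bool
example_overwrite_bytes(struct example_blob *blob, size_t offset,
                        const void *bytes, size_t to_write)
{
   /* Written so that offset + to_write cannot overflow. */
   if (offset > blob->size || to_write > blob->size - offset)
      return false;

   if (to_write > 0)
      memcpy(blob->data + offset, bytes, to_write);
   return true;
}
```

A serializer can reserve a placeholder, keep writing (possibly triggering a reallocation), and fill the placeholder in afterwards through its offset.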
Jason Ekstrand
8ae03af4ed compiler/blob: Allow for fixed-size blobs with a NULL data pointer
These can be used to easily count up the number of bytes that will be
required by "writing" it into the NULL blob.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
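Sketch for 8ae03af4ed (counting with a NULL blob): a minimal illustration of a fixed-size writer whose data pointer may be NULL, so that a first pass only measures the size. Names are hypothetical, and the sketch skips the alignment the Mesa blob performs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct example_fixed_blob {
   uint8_t *data;   /* may be NULL: sizes are tracked, nothing is stored */
   size_t capacity;
   size_t size;
   bool overrun;
};

static void
example_fixed_blob_init(struct example_fixed_blob *b, void *data,
                        size_t capacity)
{
   b->data = data;
   b->capacity = capacity;
   b->size = 0;
   b->overrun = false;
}

static bool
example_fixed_blob_write(struct example_fixed_blob *b, const void *bytes,
                         size_t n)
{
   if (b->overrun || n > b->capacity - b->size) {
      b->overrun = true;
      return false;
   }

   /* With a NULL data pointer the write only advances the size. */
   if (b->data != NULL && n > 0)
      memcpy(b->data + b->size, bytes, n);

   b->size += n;
   return true;
}
```

Running the same serialization twice, first into a NULL blob with capacity SIZE_MAX and then into a buffer of exactly the measured size, avoids a hand-written size calculation.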
Jason Ekstrand
26f6d4e5c7 compiler/blob: Add a concept of a fixed-allocation blob
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
Jason Ekstrand
49bb9f785a compiler/blob: Switch to init/finish instead of create/destroy
There's no reason why that tiny bit of memory needs to be on the heap.
We always put blob_reader on the stack, so why not do the same with the
writable blob.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
Jason Ekstrand
0e3bd56c6e compiler: Move blob up a level
We're going to want to use the blob for Vulkan pipeline caching so it
makes sense to have it in libcompiler not libglsl.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
Jason Ekstrand
8f42a43d08 meson: Add inc_compiler to the libglsl includes
2017-10-12 21:47:06 -07:00
Jason Ekstrand
e03717efbd glsl/blob: Return false from grow_to_fit if we've ever failed
Otherwise we could have a failure followed by a smaller write that
succeeds and get a corrupted blob.  If we ever OOM, we should stop.

v2 (Jason Ekstrand):
 - Initialize the new boolean member in create_blob

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-12 21:47:06 -07:00
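Sketch for e03717efbd (sticky write failure): a minimal illustration of why a failed grow must poison all later writes. The `limit` field is a hypothetical stand-in for the allocator running out of memory; the names do not match the Mesa blob.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct example_writer {
   uint8_t *data;
   size_t allocated;
   size_t size;
   size_t limit;        /* hypothetical cap that simulates OOM */
   bool out_of_memory;
};

static bool
example_grow_to_fit(struct example_writer *w, size_t additional)
{
   /* Once any grow has failed, every later one fails too. */
   if (w->out_of_memory)
      return false;

   if (additional > w->limit - w->size) {
      w->out_of_memory = true;
      return false;
   }

   size_t needed = w->size + additional;
   if (needed > w->allocated) {
      uint8_t *new_data = realloc(w->data, needed);
      if (new_data == NULL) {
         w->out_of_memory = true;
         return false;
      }
      w->data = new_data;
      w->allocated = needed;
   }
   return true;
}

static bool
example_write_bytes(struct example_writer *w, const void *bytes, size_t n)
{
   if (!example_grow_to_fit(w, n))
      return false;
   if (n > 0)
      memcpy(w->data + w->size, bytes, n);
   w->size += n;
   return true;
}
```

Without the early return on out_of_memory, a small write after a failed large one would succeed and leave a blob with a hole in the middle.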
Jason Ekstrand
7118851374 glsl/blob: Return false from ensure_can_read on overrun
Otherwise, if you have a large read fail and then try to do a small
read, the small read may succeed even though it's at the wrong offset.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-12 21:47:06 -07:00
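Sketch for 7118851374 (sticky read overrun): a minimal illustration of a reader that refuses every read after the first overrun, so a small read cannot succeed at the wrong offset. Names are hypothetical and the sketch does no alignment.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct example_reader {
   const uint8_t *current;
   const uint8_t *end;
   bool overrun;
};

static bool
example_ensure_can_read(struct example_reader *r, size_t size)
{
   /* A previous failure means the stream position is no longer trusted. */
   if (r->overrun)
      return false;

   if ((size_t)(r->end - r->current) >= size)
      return true;

   r->overrun = true;
   return false;
}

static uint8_t
example_read_uint8(struct example_reader *r)
{
   if (!example_ensure_can_read(r, 1))
      return 0;
   return *r->current++;
}

static uint32_t
example_read_uint32(struct example_reader *r)
{
   uint32_t value = 0;

   if (!example_ensure_can_read(r, sizeof(value)))
      return 0;

   memcpy(&value, r->current, sizeof(value));
   r->current += sizeof(value);
   return value;
}
```

After a 4-byte read fails with 3 bytes left, a 1-byte read returns 0 and leaves the position untouched instead of handing back a byte that belongs to the failed field.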
Chris Wilson
c866e0b3ca i965: Share the flush for brw_blorp_miptree_download into a pbo
As all users of brw_blorp_miptree_download() must emit a full pipeline
and cache flush when targeting a user PBO (as that PBO may then be
subsequently bound or *be* bound anywhere and outside of the driver
dirty tracking) move that flush into brw_blorp_miptree_download()
itself.

v2 (Ken): Rebase without userptr stuff so it can land sooner.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 19:58:40 -07:00
Jason Ekstrand
760a5815d4 meta: Delete the PBO texture upload/download path
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 19:58:40 -07:00
Jason Ekstrand
cdf626294e i965: Use blorp instead of meta for PBO pixel reads
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 19:58:40 -07:00
Jason Ekstrand
f933ef00e1 i965: Use blorp instead of meta for PBO texture downloads
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 19:58:40 -07:00
Jason Ekstrand
157faa407f i965/tex: Use blorp texture upload for all CCS_E textures
This improves the FillTex benchmark in GLBench 2.7 by 30% on my Broxton.
On Ken's Broxton which only has single-channel ram, it improves by 210%.

v2 (Ken): Check mt->aux_usage == ISL_AUX_USAGE_CCS_E rather than using
          intel_miptree_is_lossless_compressed().

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 19:58:40 -07:00
Jason Ekstrand
dffda6cbbb i965: Use blorp instead of meta for PBO texture uploads
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 19:58:40 -07:00
Jason Ekstrand
1a05bbe6a4 i965: Add blorp-based texture upload and download paths
v1 (Topi Pohjolainen): original patch.

v2 (Topi Pohjolainen):
   - Fix return value (s/MESA_FORMAT_NONE/false/) (Anuj)
   - Move _mesa_tex_format_from_format_and_type() just
     in the end avoiding additional if-block (Anuj)
   - Explain better the array alignment restriction (Anuj)
   - Do not bail out in case of gl_pixelstore_attrib::ImageHeight,
     it is handled by _mesa_image_offset() automatically (Ken).
   - Support 1D_ARRAY by flipping depth, width and y, z (Ken).

v3 (Topi Pohjolainen):
   - Contrary to v2, do not try to handle
     gl_pixelstore_attrib::ImageHeight. Currently there are no
     tests in piglit or cts for it. One could possibly copy or
     modify tests/texturing/texsubimage.c. There, however, seems
     to be number of corner cases to consider. Moreover, current
     meta path applies the packing height for both source and
     targets when determining the offset. This would probably
     require re-visiting also.

v4 (Topi Pohjolainen): Rebased on top of merged drm-bacon

v5 (Jason Ekstrand):
   - Move to brw_blorp.c
   - Significant refactoring
   - Fixed 1-D array textures
   - Simplified handling of PBOs vs. CPU data.
   - Handle gl_pixelstore_attrib::ImageHeight.  It turns out there are
     piglit tests that cover this. The original version was failing them
     because of an error in the way it handled 1-D array textures.
   - Add support for texture download

v6 (Kenneth Graunke): Rebase fixes:
   - Use intel_miptree_check_level_layer instead of deleted fields
   - Update for mesa_format_supports_render[] rename.
   - Pass 'false' (read-only) to intel_bufferobj_buffer

v7 (Kenneth Graunke):
   - Fix brw_blorp_download_miptree to pass 'false' (not read only) for
     the destination buffer (caught by Chris Wilson).
   - Fix blorp_get_client_bo to pass intel_bufferobj_buffer !read_only
     for the 'writable' parameter instead of 'false' (caught by Jason).
   - Support GL_BGR, GL_BGRA, GL_BGRA_INTEGER, GL_BGR_INTEGER, allowing
     us to use this for ReadPixels on the window system buffer (caught
     by Chris Wilson).
   - Fix y-flipping bugs in download path (exposed by BGRA support).
   - Fix false vs. NULL return value in blorp_get_client_bo.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-10-12 19:58:40 -07:00
Kenneth Graunke
acd3e073e4 i965: Refactor y-flipping coordinate transform.
I want to reuse it for the BLORP download path.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-10-12 19:58:40 -07:00
Jason Ekstrand
52f39d6910 i965/tex: Check if there is data to upload up-front
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 19:58:40 -07:00
Jason Ekstrand
d9ed4f6c32 i965/barrier: Do the correct flushes for framebuffer access
Framebuffer access includes framebuffer reads so we need to invalidate
the texture cache.  We do not, however, need to flush the depth cache
because you cannot bind a depth texture as an image.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 19:58:40 -07:00
Jason Ekstrand
45991479a3 i965/barrier: Do the correct flushes for texture updates
Texture uploads and downloads may go through the render pipe which may
result in texturing from or rendering to the texture or the PBO.  We
need to flush accordingly.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 19:58:40 -07:00
Eric Anholt
2f1cdd7137 include: Revert out the update of the Khronos GLX extension header.
They made a mistake in the MESA_swap_control XML, which I'm pursuing in
their github.  Until then, we can just back this piece out.

Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
2017-10-12 19:49:14 -07:00
Kenneth Graunke
cb9a4ae6c0 i965: Ignore GL_SKIP_DECODE_EXT for textures accessed via texelFetch().
The GL_EXT_texture_sRGB_decode spec says:

"The conversion of sRGB color space components to linear color space is
 always performed if the texel lookup function is one of the texelFetch
 builtin functions.

 Otherwise, if the texel lookup function is one of the texture builtin
 functions or one of the texture gather functions, the conversion of sRGB
 color space components to linear color space is controlled by the
 TEXTURE_SRGB_DECODE_EXT parameter.

 If the TEXTURE_SRGB_DECODE_EXT parameter is DECODE_EXT, the conversion
 of sRGB color space components to linear color space is performed.

 If the TEXTURE_SRGB_DECODE_EXT parameter is SKIP_DECODE_EXT, the value
 is returned without decoding. However, if the texture is also accessed
 with a texelFetch function, then the result of texture builtin functions
 and/or texture gather functions may be returned with decoding or without
 decoding."

This patch makes i965 force sRGB decoding for any textures accessed via
texelFetch().  If textures are accessed via texelFetch() and a regular
texture access function, this will affect the other ones too - which is
fine - it's undefined according to the last paragraph quoted.
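
As a rough sketch of the rule described above (not the actual i965 code;
the helper name and parameters are hypothetical), the decode decision
could look like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper for illustration; not the actual i965 code.
 * textures_used_by_txf has bit N set when texture unit N is read via
 * texelFetch() by the current shader. */
static bool
should_decode_srgb(uint32_t textures_used_by_txf, unsigned unit,
                   bool skip_decode_requested)
{
   assert(unit < 32);
   bool used_by_txf = (textures_used_by_txf >> unit) & 1u;

   /* texelFetch() always decodes, so force decoding for that unit even
    * if the sampler state asked for GL_SKIP_DECODE_EXT. */
   return used_by_txf || !skip_decode_requested;
}
```

Units that are never touched by texelFetch() keep honoring the sampler's
SKIP_DECODE_EXT setting.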

We could make both work, but we'd have to emit multiple SURFACE_STATEs,
and have two binding table sections, like we do for texture gather hacks
on older platforms.

Fixes the following Android O CTS test:
dEQP-GLES31.functional.srgb_texture_decode.skip_decode.srgba8.texel_fetch

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-12 17:22:42 -07:00
Kenneth Graunke
32fcced7b4 meta: Unset the textures_used_by_txf bitfield.
Drivers that use Meta are happily blitting data using texelFetch
and GL_SKIP_DECODE_EXT, but the GL_EXT_texture_sRGB_decode spec
unfortunately makes GL_SKIP_DECODE_EXT not necessarily work with texelFetch.

As a hack, just unset the textures_used_by_txf bitfield so we can
continue with the old desired behavior.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-12 17:22:42 -07:00
Kenneth Graunke
a576c148cd nir: Make nir_shader_gather_info() track texelFetch texture accesses.
For TGSI-based drivers, st_glsl_to_tgsi records this information.
For NIR-based drivers, nir_shader_gather_info() will do so.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-12 17:22:42 -07:00
Kenneth Graunke
fbf4c2916c compiler: Move gl_program::TexelFetchSamplers to shader_info.
I'd like to put this sort of metadata in the shader_info structure,
rather than adding more things to gl_program.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-12 17:22:39 -07:00
Dave Airlie
fb972ed4e5 radv: take unsafe_math and sisched into account when hashing shaders.
We want to generate different variants for sisched and unsafe_math
shader variants, so add them to the hash key.
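
A minimal sketch of the idea, assuming a simple FNV-1a hash in place of
whatever hash radv really uses; the flag names and shader_variant_key()
are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Illustration only: FNV-1a is used here for brevity and is an
 * assumption, not the hash radv uses. */
static uint32_t
fnv1a_update(uint32_t hash, const void *data, size_t size)
{
   const uint8_t *bytes = data;
   for (size_t i = 0; i < size; i++) {
      hash ^= bytes[i];
      hash *= 16777619u;
   }
   return hash;
}

/* Hypothetical flag bits describing the compile options. */
#define VARIANT_UNSAFE_MATH (1u << 0)
#define VARIANT_SISCHED     (1u << 1)

static uint32_t
shader_variant_key(const void *code, size_t code_size, uint8_t variant_flags)
{
   uint32_t hash = 2166136261u;
   hash = fnv1a_update(hash, code, code_size);
   /* Mixing the flags into the key keeps variants built with different
    * options from sharing a cache entry. */
   hash = fnv1a_update(hash, &variant_flags, sizeof(variant_flags));
   return hash;
}
```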

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-12 23:46:55 +01:00
Dave Airlie
26f1ba94a3 mesa/bufferobj: fix atomic offset/size get
When I realigned the bufferobj code, I didn't notice that the getters
were different. Realign the getters to work the same as ssbo.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103214
Fixes: 65d3ef7cd (mesa: align atomic buffer handling code with ubo/ssbo (v1.1))
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-13 07:53:34 +10:00
Marek Olšák
69730dc589 relnotes: document EGL_ANDROID_native_fence_sync on radeonsi 2017-10-12 22:27:55 +02:00
Eric Anholt
89e02db81f include: Update GL headers from khronos opengl registry.
Taken from their c6a99aff31874697741a08cbc8a3488606ce59c7, keeping the
BUILDING_MESA hunk in place.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 12:45:07 -07:00
Eric Anholt
6de8f1f970 mapi: Update extension number of MESA_tile_raster_order.
Reviewed-by: Daniel Stone <daniels@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-12 12:44:51 -07:00
Eric Anholt
dbf9e4fbf8 broadcom/vc5: Remove the u_resource_vtbl usage.
Like for vc4, this was just a wasted indirection.
2017-10-12 12:44:27 -07:00
Eric Anholt
376a0a9b08 mesa: Disallow GL_RED/GL_RG with half-floats on GLES2.
Sure, you'd think that the combination of GL_OES_texture_half_float and
GL_EXT_texture_rg would mean that GL_RG16F exists, but it doesn't.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103227
Fixes: c16a7443e9 ("mesa: Expose GL_OES_required_internalformat on GLES contexts.")
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-12 12:42:13 -07:00
Marek Olšák
f536f45250 radeonsi: implement sync_file import/export
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-12 21:07:48 +02:00
Marek Olšák
162502370c winsys/amdgpu: implement sync_file import/export
syncobj is used internally for interactions with command submission.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-12 21:07:41 +02:00
Marek Olšák
11adea4b24 ac: add radeon_info::has_sync_file
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-12 21:04:56 +02:00
Eric Anholt
20b91cd568 broadcom/vc5: Don't pair VPMSETUP with other peripheral access.
The specs don't say you can't, but pairing it with an SFU write on the
7268 breaks all our simple shader tests using gl_MVP * gl_Vertex.
2017-10-12 10:41:09 -07:00
Eric Anholt
dc9fa4bfb3 broadcom/vc5: Fix inclusion of FS flag bits in dumping the FS address. 2017-10-12 10:41:09 -07:00
Marek Olšák
255573996c st/dri: implement __DRIimageExtension::validateUsage properly
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-12 19:03:33 +02:00
Marek Olšák
07fdc0a09c gallium: add pipe_screen::check_resource_capability
This is optional (and no CAP).

Implemented by radeonsi, ddebug, rbug, trace.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-12 19:03:33 +02:00
Marek Olšák
5f2073be32 ac/surface: add ac_surface::is_displayable
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-12 19:03:33 +02:00
Marek Olšák
c3f3685fd6 amd/addrlib: add Addr2IsValidDisplaySwizzleMode
Some "standard" (_S) swizzle modes are displayable on Raven,
even though the micro tile mode says it's not displayable.
Expose the addrlib function to the driver.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-12 19:03:33 +02:00
tournier.elie
1233d32d2a meson: fix typo in isl
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-12 09:39:07 -07:00
Rob Herring
137b32b815 Android: disable i9x5 drivers on non-x86 builds
The i965 driver has become dependent on x86 specific compiler builtin
functions, so ensure it's disabled for non-x86 builds.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-10-12 09:06:09 -05:00
Wladimir J. van der Laan
78ade65956 etnaviv: Do GC3000 resolve-in-place when possible
If an RS blit is done with source exactly the same as destination, and
the hardware supports this, do an in-place resolve. This only fills in
tiles that have not been rendered to using information from the TS.

This is the same as the blob does and potentially saves significant
bandwidth when doing i.MX6qp scanout using PRE, and when rendering to
textures (though here using sampler TS would be even better).

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-10-12 16:03:26 +02:00
Eric Engestrom
3ba5a467a5 egl_haiku: drop haiku_egl_driver struct
The struct only contained the one field we're interested in.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-12 14:55:10 +01:00
Eric Engestrom
3188b6e45f egl: remove left over _EGLMain_t
Fixes: b174a1ae72 "egl: Simplify the "driver" interface"
Cc: Adam Jackson <ajax@redhat.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-12 14:43:48 +01:00
Eric Engestrom
4a6c7e8ad8 egl: drop memset(0) of calloc'ed memory
`_EGLDriver *drv` is a freshly calloc()'ed object, memset(0)'ing some of
it is a no-op.
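
For illustration, with a hypothetical stand-in struct (not the real
_EGLDriver layout), the pattern being relied on is:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a driver struct; field names are made up. */
struct example_driver {
   int api_version;
   unsigned flags;
};

static struct example_driver *
example_driver_create(void)
{
   /* calloc() hands back zero-filled memory, so a following
    * memset(drv, 0, ...) would change nothing. */
   return calloc(1, sizeof(struct example_driver));
}

static bool
example_driver_is_zeroed(const struct example_driver *drv)
{
   assert(drv != NULL);
   return drv->api_version == 0 && drv->flags == 0;
}
```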

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-12 14:40:12 +01:00
Eric Engestrom
9690759d0c egl: replace _egl_driver->Unload() callback with a simple free()
Bonus: fixes a memleak on haiku when unloading the driver

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-12 14:38:55 +01:00
Dave Airlie
6049fa454e radv: don't crash if cache is disabled.
If you set MESA_GLSL_CACHE_DISABLE, radv crashed.

Fixes: fd24be134f (radv: make use of on-disk cache)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-12 14:14:43 +02:00
Samuel Pitoiset
4f42ea4dcf radv: use CLEAR_STATE for initializing some registers
Based on RadeonSI.

This improves some Vulkan demos by +1% to +3%.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-12 09:17:43 +02:00
Samuel Pitoiset
65dcff7a26 radv: add has_clear_state and enable it on CIK+ only
This will allow us to emit the CLEAR_STATE packet instead
of a bunch of useless packets when doing CS initialization.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-12 09:17:40 +02:00
Samuel Pitoiset
c74ed3966e radv: do not set registers for merged ES-GS on GFX9
Based on RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-12 09:17:38 +02:00
Samuel Pitoiset
1789cac6dd radv: move the raster config emission in si_set_raster_config()
Similar to RadeonSI, also only call this function for <= VI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-12 09:17:35 +02:00
Nicolai Hähnle
bc2d874101 radeonsi: add support for PIPE_FORMAT_{X1,A1}R5G5B5_UNORM
Fixes dEQP-EGL.functional.image.modify.tex_rgb5_a1_tex_subimage_rgba8

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-12 08:42:55 +02:00
Nicolai Hähnle
9f55da130e gallium: add tests for PIPE_FORMAT_{X1,A1}B5G5R5_UNORM formats
This is a left-over from my version of adding the new format
after rebasing on Eric's version.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-12 08:42:55 +02:00
Dave Airlie
a3ba14d0ce include/drm-uapi: clarify when headers can be updated.
Clarify when headers can be updated here.

Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-12 09:44:27 +10:00
Timothy Arceri
0061a90550 radv: remove duplicate line of code
The same line of code is a few lines above.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-12 08:52:39 +11:00
Timothy Arceri
fd24be134f radv: make use of on-disk cache
If the app-provided in-memory pipeline cache doesn't yet contain
what we are looking for, or the app doesn't provide one at all, then we
fall back to the on-disk cache.
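
The lookup order can be sketched roughly as follows. The types and
function names are hypothetical, and either cache may be absent (NULL):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified cache types for illustration only. */
struct cache_entry {
   uint32_t key;
   int value;
};

struct shader_cache {
   const struct cache_entry *entries;
   size_t count;
};

static const struct cache_entry *
cache_search(const struct shader_cache *cache, uint32_t key)
{
   if (!cache)
      return NULL; /* cache not provided or disabled */
   assert(cache->entries != NULL || cache->count == 0);
   for (size_t i = 0; i < cache->count; i++) {
      if (cache->entries[i].key == key)
         return &cache->entries[i];
   }
   return NULL;
}

static const struct cache_entry *
lookup_shader(const struct shader_cache *app_cache,
              const struct shader_cache *disk_cache, uint32_t key)
{
   /* Prefer the app's in-memory pipeline cache ... */
   const struct cache_entry *entry = cache_search(app_cache, key);
   if (entry)
      return entry;
   /* ... and only then consult the on-disk cache. */
   return cache_search(disk_cache, key);
}
```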

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-12 08:52:38 +11:00
Timothy Arceri
1421625292 radv: create on-disk shader cache
This is the driver's on-disk cache, intended to be used as a
fallback as opposed to the pipeline cache provided by apps.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-12 08:52:38 +11:00
Timothy Arceri
7664aaf331 radv: remove duplicate debug_flags field
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-12 08:52:38 +11:00
Lionel Landwerlin
e568d2bd1f anv: intel: use anv_image's computed size for importing a BO
Rather than relying on size = stride * height, we can rely on
anv_image's total size.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2017-10-11 22:29:55 +01:00
Lionel Landwerlin
c0a4f56fb9 anv: bo_cache: allow importing a BO larger than needed
It's not a problem if a BO has been allocated larger than we need it
to be.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102940
Fixes: 818b857914 ("anv: Use the BO cache for DeviceMemory allocations")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-11 22:29:55 +01:00
Nicolai Hähnle
cf3dd91969 st/glsl_to_tgsi: the second destination doesn't support relative addressing
It's not used -- DFRACEXP gets array indexes of its exponent out-parameter
lowered earlier -- and it wouldn't have worked correctly anyway when both
dst and dst1 use relative addressing.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-11 23:17:08 +02:00
Nicolai Hähnle
3b666aa747 st/glsl_to_tgsi: fix DFRACEXP with only one destination
Replace the undefined destination by a new temporary register.

Cleanup merge_two_dsts while we're at it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-11 23:17:06 +02:00
Nicolai Hähnle
541208cf13 st/glsl_to_tgsi: fix indirect access to 64-bit integer
Make sure we actually allocate two adjacent TGSI temporaries. The
current code fails e.g. when an arithmetic operation has two
operands with indirect accesses.

I will send out a new piglit test
(arb_gpu_shader_int64/execution/indirect-array-two-accesses.shader_test)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-11 23:16:56 +02:00
Nicolai Hähnle
2991c0d7df st/mesa: don't assign prog->ShadowSamplers
It's not used, and the assignment for the TGSI case was incorrect
for sampler arrays.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-11 23:16:21 +02:00
Nicolai Hähnle
0e26e767d2 st/glsl_to_tgsi: ignore GL_TEXTURE_SRGB_DECODE_EXT for samplers used with texelFetch*()
See the comment for the relevant spec quote.

Fixes dEQP-GLES31.functional.srgb_texture_decode.skip_decode.srgba8.texel_fetch

v2: note the interaction between ARB_bindless_texture and EXT_texture_sRGB_decode
    as a TODO

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-11 23:16:20 +02:00
Nicolai Hähnle
bce3055c69 st/mesa: store state that affects sampler views per context
This fixes sequences like:

1. Context 1 samples from texture with sRGB decode enabled
2. Context 2 samples from texture with sRGB decode disabled
3. Context 1 samples from texture with sRGB decode disabled

Previously, step 3 would see the prev_sRGBDecode value from context 2
and would incorrectly use the old sampler view with sRGB decode enabled.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-11 23:16:20 +02:00
Tim Rowley
e484805352 swr: simd16 shaders work in progress
Start building vertex shaders as simd16.

Disabled by default, set USE_SIMD16_SHADERS in knobs.h to experiment.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-11 14:35:23 -05:00
Tim Rowley
9cad9cbaf8 gallium: allow 512-bit vectors
Increase the max allowed vector size from 256 to 512.

No piglit llvmpipe regressions running on avx2.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-10-11 14:34:31 -05:00
Kenneth Graunke
fe7fab4be5 i965: Drop brw_bo_alloc in ARB_indirect_parameters implementation.
The original implementation allocated a new BO here, but we decided to
switch to intel_upload_space, which returns a reference to the current
upload BO.  We accidentally kept the brw_bo_alloc, even though it's no
longer necessary - intel_upload_space will immediately unreference it,
causing us to allocate and immediately free a buffer.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2017-10-11 12:22:29 -07:00
Kenneth Graunke
e401cfa28a i965: Allow mapped VBOs during drawing in non-debug contexts.
Section 6.3.2 of the GL 4.5 spec says:

   "Any GL command which attempts to read from, write to, or change
    the state of a buffer object may generate an INVALID_OPERATION error
    if all or part of the buffer object is mapped ... However, only
    commands which explicitly describe this error are required to do so.
    If an error is not generated, such commands will have undefined
    results and may result in GL interruption or termination."

Setting this flag allows us to skip walking over the buffer bindings
for every enabled vertex attribute (_mesa_all_buffers_are_unmapped).

Improves performance in GFXBench4's gl_driver2_off microbenchmark by
3.05797% +/- 0.709031% (n=33) on Apollolake.

This breaks KHR-*.draw_elements_base_vertex_tests.invalid_mapped_bos,
but that test is invalid and has been removed from the upstream CTS.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-11 12:22:25 -07:00
Dylan Baker
3c66a461f3 meson: fix glx test
That requires a generated header that was rolled into a loop.

Fixes: a47c525f32 ("meson: build glx")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-11 10:45:31 -07:00
Ilia Mirkin
b20bccbcac nv50,nvc0: fix push hint logic in presence of a start offset
Previously buffer offsets were passed in explicitly as an offset, which
had to be added to the resource address. Now they are passed in via an
increased 'start' parameter. As a result, we were double-adding the
start offset in this kind of situation.

This condition was triggered by piglit's draw-elements test which has a
requisite glMultiDrawElements in combination with a small enough number
of vertices to go through the immediate push path.

Fixes: 330d0607ed ("gallium: remove pipe_index_buffer and set_index_buffer")
Reported-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-11 08:18:16 -04:00
Kenneth Graunke
735711ab86 i965: Make brw_update_texture_surface static.
Trivial.  It's not used in other files.
2017-10-11 00:09:06 -07:00
Rob Herring
e5e93c727f Android: fix build break from r600/radeon split
Commit 06bfb2d28f ("r600: fork and import gallium/radeon") broke the
Android build:

external/mesa3d/src/gallium/drivers/radeon/r600_pipe_common.c:43:10: fatal error: 'llvm-c/TargetMachine.h' file not found
         ^~~~~~~~~~~~~~~~~~~~~~~~

Update the Android makefiles so that drivers/radeon is only built when
radeonsi (and therefore LLVM) is enabled.

Fixes: 06bfb2d28f (r600: fork and import gallium/radeon)
Acked-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-10-10 21:37:19 -05:00
Rob Herring
c3f75d483c Android: move libraries to /vendor
As part of Treble project in Android O, all the device specific files have
to be located in a separate vendor partition. This is done by setting
LOCAL_PROPRIETARY_MODULE (the name is misleading). This change will not
break existing platforms without a vendor partition as it will just move
files to /system/vendor.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-10-10 21:37:16 -05:00
Kenneth Graunke
6f5abf3146 i965: Fix output register sizes when multiple variables share a slot.
ARB_enhanced_layouts allows multiple output variables to share the same
location - and these variables may not have the same sizes.  For
example, consider these output variables:

   // consumes X/Y/Z components of 6 vectors
   layout(location = 0) out vec3 a[6];

   // consumes W component of the first vector
   layout(location = 0, component = 3) out float b;

Looking at the first declaration, we see that VARYING_SLOT_VAR0 needs 24
components worth of space (vec3 padded out to a vec4, 4 * 6 = 24).  But
looking at the second declaration, we would think that VARYING_SLOT_VAR0
needs only 4 components of space (a single float padded out to a vec4).

nir_setup_outputs() only considered the space requirements of the first
declaration it happened to see, so if 'float b' came first, it would
underallocate the output register space, causing brw_fs_validator.cpp
to assert fail about inst->dst.offset exceeding the register size.
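
A simplified sketch of the sizing rule, assuming each declaration is
reduced to a location and a vec4 slot count (the struct and function
names are hypothetical, not the i965 ones): take the maximum over every
declaration that shares the location, so the result no longer depends on
declaration order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EXAMPLE_MAX_SLOTS 32

/* Hypothetical, simplified view of an output variable declaration. */
struct output_decl {
   unsigned location;   /* varying slot index */
   unsigned vec4_slots; /* vec4 slots covered, e.g. 6 for vec3 a[6] */
};

static void
compute_slot_sizes(const struct output_decl *decls, size_t num_decls,
                   unsigned components[EXAMPLE_MAX_SLOTS])
{
   memset(components, 0, EXAMPLE_MAX_SLOTS * sizeof(components[0]));

   for (size_t i = 0; i < num_decls; i++) {
      unsigned loc = decls[i].location;
      assert(loc < EXAMPLE_MAX_SLOTS);

      /* Every slot is padded out to a full vec4. */
      unsigned size = 4 * decls[i].vec4_slots;

      /* Keep the largest requirement seen for this location. */
      if (size > components[loc])
         components[loc] = size;
   }
}
```

With the two declarations above, location 0 gets 24 components whether
'float b' or 'vec3 a[6]' is visited first.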

Fixes Piglit's tests/spec/arb_enhanced_layouts/execution/component-layout/
vs-to-fs-array-interleave-single-location.shader_test.

Thanks to Tim Arceri for finding this bug and writing a test!

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-10 17:29:37 -07:00
Dave Airlie
2d36efdb7f nir: bump loop unroll limit to 96.
With the ssao demo from Vulkan demos:
radv/rx480: 440->440fps
anv/haswell: 24->34 fps

The demo does a 0->32 loop across a ubo with 32 members.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-11 10:11:36 +10:00
Dave Airlie
5be3fdfa32 anv: fix assert in wsi image code.
This assert was firing just running demos.

Jason said it should be this.

Fixes: 6c7720ed78 (anv/wsi: Allocate enough memory for the entire image)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-11 09:52:57 +10:00
Dave Airlie
9926af0e71 mesa/st: fix atomic buffer sizing to align with ssbo.
This respects the size from the range setting like ssbo.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-11 09:10:21 +10:00
Dave Airlie
3e156b89ed mesa/bufferobj: consolidate some buffer binding code.
These paths are again 90% the same, consolidate them into
one.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-11 09:10:21 +10:00
Dave Airlie
35ac13ed32 mesa/bufferobj: consolidate some codepaths between ubo/ssbo/atomics.
These are 90% the same code, consolidate them into a couple of
common codepaths.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-11 09:10:21 +10:00
Dave Airlie
d2bfa76045 mesa: rename various buffer bindings to one struct.
One binding to bind them all, these are all the same thing.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-11 09:10:20 +10:00
Dave Airlie
65d3ef7cd4 mesa: align atomic buffer handling code with ubo/ssbo (v1.1)
This adds automatic size support to the atomic buffer code,
but also realigns the code to act like the ubo/ssbo code.

v1.1:
add missing blank lines.
reindent one block properly.
check for NullBufferObj.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-11 09:10:20 +10:00
Kenneth Graunke
03087686ff i965: Don't try to decode types for non-existent src1.
KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks has a MOV that hits
this validation path.  MOVs don't have a src1 file, but calling
brw_inst_src1_type() was tripping on src1.file being BRW_IMMEDIATE_VALUE
and the hw_type being something invalid for immediates.

To work around this, just pretend src1 is src0 if there isn't a src1.
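
Roughly, and with hypothetical types rather than the real brw_inst
accessors, the workaround amounts to:

```c
#include <assert.h>

/* Hypothetical register types and instruction layout, for illustration. */
enum example_reg_type {
   EXAMPLE_TYPE_F,
   EXAMPLE_TYPE_DF,
   EXAMPLE_TYPE_D,
};

struct example_inst {
   unsigned num_sources;
   enum example_reg_type src_type[2];
};

static enum example_reg_type
example_src1_type(const struct example_inst *inst)
{
   assert(inst->num_sources >= 1 && inst->num_sources <= 2);
   /* Single-source instructions such as MOV have no meaningful src1
    * fields, so report src0's type instead of decoding them. */
   return inst->num_sources < 2 ? inst->src_type[0] : inst->src_type[1];
}
```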

Fixes: 2572c2771d (i965: Validate "Special Requirements for Handling Double Precision Data Types")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102680
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-10 15:11:35 -07:00
Karol Herbst
eab078f132 main/format: skip format conversion if src and dst format are equal
Fixes 'KHR-GL45.copy_image.functional' on Nouveau and i965.

v2: (by Kenneth Graunke)
    Rewrite patch according to Jason Ekstrand's review feedback.
    This makes it handle differing strides, which i965 needed.

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-10 15:06:17 -07:00
Jason Ekstrand
51e7879544 mesa: Make _mesa_get_format_bytes handle array formats.
This is easier than making callers handle a bunch of special cases.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-10 15:06:17 -07:00
Bas Nieuwenhuizen
96f80c8d4d radv: Only set the MTYPE flags on GFX9+.
Older kernels fail the va_op with this flag set. If the kernel
supports GFX9 usefully, it will also support this flag.

Fixes: e8d57802fe "radv/gfx9: allocate events from uncached VA space"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-11 07:57:33 +10:00
Kenneth Graunke
ea0d2e98ec i965: Disable auxiliary buffers when there are self-dependencies.
Jason and I investigated several OpenGL CTS failures where the tests
bind the same texture for rendering and texturing, at the same time.
This has defined results as long as the reads happen before writes,
or the regions are non-overlapping.  Normally, this just works out.

However, CCS can cause problems.  If the shader is reading one set of
pixels, and writing to different pixels that are adjacent, they may end
up being covered by the same CCS block.  So rendering may be writing a
CCS block, while the sampler is trying to read it.  Corruption ensues.

Disabling CCS is unfortunate, but safe.

Fixes several KHR-GL45.texture_barrier.* subtests.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-10 14:57:04 -07:00
Dave Airlie
96e85709df r600: cleanup llvm ir target selection.
Only the r600 target is used now for compute IR.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-11 07:40:04 +10:00
Dave Airlie
ce0ee31890 r600: drop tc_L2_dirty bit, this was SI only.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-11 07:39:32 +10:00
Dave Airlie
80bbdb1483 radeonsi: lower ffma in nir to mad.
This lowers ffma to a * b + c.

This seems like it should keep Marek happiest, so
we'd never get to the fma instruction emission code.
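
In plain C terms (illustration only, the function name is made up), the
lowered form is a separate multiply and add, which rounds twice instead
of once:

```c
/* Lowered form of ffma(a, b, c): an unfused multiply followed by an add.
 * Note that a C compiler may itself contract this into a fused operation
 * depending on its floating-point contraction settings. */
static float
lowered_ffma(float a, float b, float c)
{
   float product = a * b; /* rounded once here */
   return product + c;    /* and rounded again here */
}
```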

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-11 07:33:32 +10:00
Dave Airlie
2c61594d84 radv: lower ffma in nir.
So it appears the Vulkan SPIR-V fma opcode can be equivalent to a
mad operation, and the fma hw opcode on AMD hw is issued like a double
opcode so is slower. Also the radeonsi stack does this.

This appears to improve performance on a number of games from Feral,
and thanks to Feral for noticing the problem.

I'm reposting this one as Marek indicated he thinks this is what
we should be doing on AMD hw.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-11 07:31:27 +10:00
Alex Smith
25d76fd658 radv: Add R16G16B16A16_SNORM fast clear support
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-11 07:29:43 +10:00
Eric Anholt
2687183a34 broadcom/vc5: Fix handling of 5551 textures using the new gallium format.
Like vc4, we have the alpha in the low bit.  Fixes a bunch of piglit
texwrap failures.
2017-10-10 11:42:06 -07:00
Eric Anholt
f4b5158874 broadcom/vc5: Set the RCL's MSAA mode to match the BCL's MSAA state. 2017-10-10 11:42:06 -07:00
Eric Anholt
ae9a56db6a broadcom/vc5: Set up clear color for higher-bpp formats.
Fixes arb_color_buffer_float-clear
2017-10-10 11:42:06 -07:00
Eric Anholt
c0561808c0 broadcom/vc5: Set up per-MRT clear colors.
Fixes fbo-mrt-alphatest.
2017-10-10 11:42:06 -07:00
Eric Anholt
5208d2889e broadcom/vc5: Fix blendfactor zero handling.
I cut the line out to move it up to the top, when putting "0" in the
switch made the compiler complain that that wasn't a valid enum.
2017-10-10 11:42:06 -07:00
Eric Anholt
ffdba7fd4c broadcom/vc5: Fix Rendering Mode Common Config's color store bitmask.
This controls the RTs that get stored by the default resolved store, the
same way that the extended resolved store packet has a RT bitmask.
2017-10-10 11:42:06 -07:00
Eric Anholt
4b7de2a360 broadcom/vc5: Add support for f32 render targets.
The TLB write code is getting ugly and needs a refactoring (that will
hopefully handle TLBU uniform coalescing as well).
2017-10-10 11:42:06 -07:00
Eric Anholt
f2e6e1bbc3 broadcom/vc5: Fix color masks for non-independent blending.
This gets fbo-mrt-alphatest working except for the second RT's clear color.
2017-10-10 11:42:06 -07:00
Eric Anholt
476db7e66b broadcom/vc5: Make the BCL's number of render targets setup match the RCL. 2017-10-10 11:42:06 -07:00
Eric Anholt
8b4c00a7b2 broadcom/vc5: Fix tile size setup for MRTs.
We need to divide the TLB in two for the 2nd color buffer, and again if
the 3rd or 4th are present.
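
A sketch of that rule, assuming at most four color buffers; the helper
name is hypothetical, and how the split maps onto tile width and height
is left out:

```c
#include <assert.h>

/* Hypothetical helper: by how much the tile area available to a single
 * color buffer shrinks as more render targets are bound. */
static unsigned
mrt_tlb_divisor(unsigned num_color_buffers)
{
   assert(num_color_buffers >= 1 && num_color_buffers <= 4);

   if (num_color_buffers > 2)
      return 4; /* 3rd or 4th buffer present: halve again */
   if (num_color_buffers > 1)
      return 2; /* 2nd buffer present: split the TLB in two */
   return 1;
}
```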
2017-10-10 11:42:06 -07:00
Eric Anholt
dc25a83a7a broadcom/vc5: Start hooking up multiple render targets support.
We now emit as many TLB color writes as there are color buffers.
2017-10-10 11:42:05 -07:00
Eric Anholt
f0ee7d6ba8 broadcom/vc5: Add support for GL_EXT_provoking_vertex.
The bit was missing from the spec, but it's there in the simulator.  Fixes
the piglit clipflat test.
2017-10-10 11:42:05 -07:00
Eric Anholt
f4133865d1 broadcom/vc5: Find the actual first TF output for our TF spec.
This doesn't yet support PSIZ, but gets us at least some of TF working.
2017-10-10 11:42:05 -07:00
Eric Anholt
bd94f6821e broadcom/vc5: Fix translation of transform feedback's output_register field.
It's a NIR driver_location, not a slot offset.
2017-10-10 11:42:05 -07:00
Eric Anholt
d8bc9c71df broadcom/vc5: Mark our primitives as needing TF processing.
The TF enable state appears to stick around until the next TF enable
packet is sent, so we only want to request TF when the shader is using it.
2017-10-10 11:42:05 -07:00
Eric Anholt
28105560f7 broadcom/vc5: Fix setup of TF dword output count.
I missed the "- 1" when reading the spec.
2017-10-10 11:42:05 -07:00
Eric Anholt
3ac8a2a4ba broadcom/vc5: Fix up a comment from vc4 about the predraw texture setup. 2017-10-10 11:42:05 -07:00
Eric Anholt
ec5af12b5d broadcom/vc5: Flush the job when mapping a transform feedback buffer.
We will want something fancier for reusing a TF output within the same
frame, but we at least need this in order for piglit tests to work.
2017-10-10 11:42:05 -07:00
Eric Anholt
361c5f28bd broadcom/vc5: Fix handling of interp qualifiers on builtin color inputs.
The interpolation qualifier, if specified, is supposed to take precedence
over glShadeModel().
2017-10-10 11:42:05 -07:00
Eric Anholt
d0dfc4bd5f broadcom/vc5: Fix CLIF dumping of lists that aren't capped by a HALT.
The HW will halt when you hit a HALT packet, or when you hit the end
address.  Tell CLIF if there's an end address so that it can stop
correctly.  (There was usually a 0 byte after the CL, so it would stop
anyway).
2017-10-10 11:42:05 -07:00
Eric Anholt
7f3b890697 broadcom/vc5: Fix depth and stencil clear values.
I had misread the packet description: We always have a 32f depth, and a
separate u8 stencil.
2017-10-10 11:42:05 -07:00
Eric Anholt
be11251e3c broadcom/vc5: Add missing Z16 format.
We can render to and sample from it just fine.
2017-10-10 11:42:05 -07:00
Eric Anholt
e20c82c550 broadcom/vc5: Fix incorrect early Z writes in discard shaders.
Fixes glsl-fs-discard-02.
2017-10-10 11:42:05 -07:00
Eric Anholt
732a3a72cb broadcom/compiler: Set up passthrough Z when doing FS discards.
In order to keep early-Z from writing early in a discard shader, you need
to set the "modifies Z" bit in the shader state (which the new
prog_data.discards will indicate).  Then, in the shader we do a TLB write
to make Z passthrough happen (the QPU result is ignored, so we use a NULL
source).
2017-10-10 11:42:05 -07:00
Eric Anholt
4c4fbab345 broadcom/compiler: Don't forget the discard state on TLB Z writes.
We don't want to write Z for discarded fragments.
2017-10-10 11:42:05 -07:00
Eric Anholt
84939552d0 broadcom/compiler: Use defines instead of magic values in TLB write setup. 2017-10-10 11:42:05 -07:00
Eric Anholt
c25de31824 broadcom/vc5: Add proper support for base_vertex and base_instance.
I had base_vertex hacked into the shader state setup like in vc4, but it's
not correct for big offsets.  Using the proper packet is easier and
hopefully means we can re-emit shader state setup less frequently.
2017-10-10 11:42:05 -07:00
Eric Anholt
e74a9e8def broadcom/xml: Add the vc5 Base Vertex/Base Instance packet.
This lets us do index_bias and ARB_base_instance.
2017-10-10 11:42:05 -07:00
Eric Anholt
24c8bbbb75 broadcom/vc5: Use supertiles and generic tile lists.
This massively reduces the size of our RCL setup.  It also gets us closer
to supporting multicore platforms.
2017-10-10 11:42:05 -07:00
Eric Anholt
4b2cf771e6 broadcom/xml: Add a bunch more vc5 tile list management packets.
We're going to need these for MSAA, and to use the generic per-tile list.
2017-10-10 11:42:04 -07:00
Eric Anholt
efa329ab4f broadcom/xml: Remove vc5 base packet for tile bin/render mode config.
These existed so I could unpack just the sub-id field to switch on in the
old manual CLIF dumper.  The new codegen handles sub-id automatically, but
only if these stub packets aren't there with an implicit sub-id=0.
2017-10-10 11:42:04 -07:00
Eric Anholt
afb31a9e87 broadcom/xml: Fix a pasteo in vc5 store tile buffer general. 2017-10-10 11:42:04 -07:00
Eric Anholt
45bb8f2957 broadcom: Add V3D 3.3 gallium driver called "vc5", for BCM7268.
V3D 3.3 is a continuation of the 3D implementation in VC4 (v2.1 and v2.6).
V3D 3.3 introduces an MMU (no more CMA allocations) and support for
GLES3.1.  This driver is not currently conformant, though that will be a
target as soon as possible.

V3D 3.x parts use a new texture tiling layout common across many Broadcom
graphics parts, including the HVS scanout engine.  It also massively
changes the QPU instructions, introducing a common physical register file
(no more A/B split) and half-float instructions, while removing the 4x8
unorm instructions in favor of half-float for talking to fixed function
interfaces.  Because so much has changed, vc5 is implemented in a separate
gallium driver, using only the XML code-generation support from vc4.

v2: Fix tile layout for 64bpp textures.  Fix texture swizzling for 32-bit
    returns.  Fix up a bit of MRT setup.  Sync the simulator to kernel
    behavior a bit more.  Improve uniform debugging code.  Rebase on
    QIR->VIR rename.  Move texture state mostly to the CSOs.  Improve
    cache flushing on the simulator.  Fix program deletion
    use-after-frees.

Acked-by: Dave Airlie <airlied@gmail.com> (uabi plan)
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> (uabi plan)
2017-10-10 11:42:04 -07:00
Eric Anholt
ade416d023 broadcom: Add VC5 NIR compiler.
This is a pretty straightforward fork of VC4's NIR compiler to VC5.  The
condition codes, registers, and I/O have all changed, making the backend
hard to share, though their heritage is still recognizable.

v2: Move to src/broadcom/compiler to match intel's layout, rename more
    "vc5" to "v3d", rename QIR to VIR ("V3D IR") to avoid symbol conflicts
    with vc4, use new v3d_debug header, add compiler init/free functions,
    do texture swizzling in NIR to allow optimization.
2017-10-10 11:42:04 -07:00
Eric Anholt
f71364f297 broadcom: Add vc5 CLIF dumping
This will be usable with "VC5_DEBUG=cl" on the vc5 driver to stream a CLIF
file (the Broadcom equivalent of i965's AUB) to stderr.  I haven't tested
that this is actually usable with the internal CLIF-consuming tools, but
is close enough as a baseline and is useful for visually inspecting the
command stream.
2017-10-10 11:42:04 -07:00
Eric Anholt
05c7d9715b broadcom: Add V3D 3.3 QPU instruction pack, unpack, and disasm.
Unlike VC4, I've defined an unpacked instruction format with pack/unpack
functions to convert to 64-bit encoded instructions.  This will let us
incrementally put together our instructions and validate them in a more
natural way than the QPU_GET_FIELD/QPU_SET_FIELD used to.

The pack/unpack functions are unfortunately written by hand.  While I could define
genxml for parts of it, there are many special cases (like operand order
of commutative binops choosing which binop is being performed!) and it
probably wouldn't come out much cleaner.

The disasm unit test ensures that we have the same assembly format as
Broadcom's internal tools, other than whitespace changes.

v2: Fix automake variable redefinition complaints, add test to .gitignore
2017-10-10 11:42:04 -07:00
Eric Anholt
59257c35eb broadcom: Introduce a v3d_debug.h header for vc5 and broadcom Vulkan.
Unlike vc4, where the compiler and gallium driver live together, for vc5
the compiler will live up in the shared broadcom directory, and need
access to the debug flags.  Define a set of debug flags and helpers there,
so it can be shared between compiler, vc5, and vulkan.
2017-10-10 11:42:04 -07:00
Eric Anholt
ae106592a6 configure: Add the new "vc5" driver to the list, requiring a simulator.
My intent is to develop the vc5 driver in-tree for some time to build the
CL generation and shader compiler code, and keep out-of-tree patches for
talking to an actual kernel driver until the kernel driver can be
stabilized on the hardware.

v2: Define a HAVE_BROADCOM_DRIVERS, like HAVE_INTEL or HAVE_AMD.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-10 11:42:04 -07:00
Eric Anholt
c34295b1a3 nir: Move vc4's alpha test lowering to core NIR.
I've been doing this inside of vc4, but vc5 wants it as well and it may be
useful for other drivers (Intel has a related path for pre-gen6 with MRT,
and freedreno had a TGSI path for it at one point).

This required defining a common enum for the standard comparison
functions, but other lowering passes are likely to also want that enum.

v2: Add to meson.build as well.

Acked-by: Rob Clark <robdclark@gmail.com>
2017-10-10 11:42:04 -07:00
Eric Anholt
e37b32f80c mesa: Alphabetize GL_MESA_tile_raster_order in the extensions list.
trivial, fixes make check.
2017-10-10 11:42:04 -07:00
Eric Anholt
e676434856 mesa: Implement a new GL_MESA_tile_raster_order extension.
The intent is to use this extension on vc4 to allow X11 to do overlapping
CopyArea() within a pixmap without first blitting the pixmap to a
temporary.  With associated glamor patches, improves x11perf
-copywinwin100 performance on a Raspberry Pi 3 from ~4700/sec to
~5130/sec, and is an even larger boost to uncomposited window movement
performance (most copywinwin100 copies don't overlap).

v2: Fix glIsEnabled() on the new enums.
v3: Drop the local spec since I'm upstreaming the spec.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-10 10:45:22 -07:00
Eric Anholt
087b39a346 broadcom/vc4: Expose PIPE_CAP_TILE_RASTER_ORDER
Because vc4 can control the order that tiles are rasterized in, we can use
it to implement overlapping blits using normal drawing and
GL_ARB_texture_barrier, as long as we can tell the kernel what order to
render the tiles in.
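
The reason tile order matters for overlapping copies is the same reason
memmove() picks a copy direction.  The C sketch below is only an
illustration under simplifying assumptions (a pure translation by
(dx, dy), one direction per axis); the struct and function names are
hypothetical and are not the driver's or the kernel's interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-axis raster direction. */
struct raster_order {
   bool increasing_x;
   bool increasing_y;
};

/* For a copy that moves pixels by (dx, dy) within one surface, walk the
 * tiles so that each source tile is read before it is overwritten:
 * moving content to the right means walking right to left, and so on. */
static struct raster_order
raster_order_for_copy(int dx, int dy)
{
   struct raster_order order;

   order.increasing_x = dx <= 0;
   order.increasing_y = dy <= 0;
   return order;
}
```

For example, a copy that shifts content right by 8 pixels would ask for
decreasing X in this sketch, while a shift up and to the left keeps the
default increasing order on both axes.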

v2: Fix on the simulator.
v3: Add the cap (disabled) to other drivers, add rst docs for the cap.
v4: Rebase on PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS
v5: Split from the core gallium commit, drop some unnecessary code related
    to glBlitFramebuffer(), fix a crash with clears before state has been
    bound.
2017-10-10 10:45:22 -07:00
Eric Anholt
ac0051a507 gallium: Create a new PIPE_CAP_TILE_RASTER_ORDER for vc4.
Because vc4 can control the order that tiles are rasterized in, we can use
it to implement overlapping blits using normal drawing and
GL_ARB_texture_barrier, as long as we can tell the kernel what order to
render the tiles in.

This commit introduces the core gallium support, vc4 changes will follow.

v2: Fix on the simulator.
v3: Add the cap (disabled) to other drivers, add rst docs for the cap.
v4: Rebase on PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS
v5: Drop vc4 changes from this commit, for clarity.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v3)
2017-10-10 10:45:22 -07:00
Eric Anholt
4aa700e0e0 broadcom/vc4: Implement GL_ARB_texture_barrier.
Improves x11perf -copywinwin100 from ~2000/sec to ~4700/sec.  More
importantly, this is a prerequisite for the new GL_MESA_tile_raster_order
extension.
2017-10-10 10:45:22 -07:00
Eric Anholt
13b303ff92 docs: Update the list of used MESA GL enums.
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-10 10:45:22 -07:00
Eric Anholt
9ab0d83079 docs: Fix a typo in the old MESA_program_debug spec.
Noticed that we had two 0x8bb4 in the spec while grepping to find an open
slot in the MESA enums set.  gl.xml had the right value.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-10 10:45:22 -07:00
Brian Paul
a3b2e60aa0 git_sha1_gen: accept MESA_GIT_SHA1_OVERRIDE env var
If one uses a parent build script to download/build Mesa we may not
have a full git repository (maybe a tar archive) so the 'git rev-parse'
command will fail.

This updates the script to look for a MESA_GIT_SHA1_OVERRIDE env var.
If it's set, use that sha1 instead of using git rev-parse.  With this
change we can put a git hash in the GL_VERSION string even when we
don't have a git repo.

v2: incorporate Dylan's suggestions to simplify the code

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-10 11:28:31 -06:00
Brian Paul
c43b0d3f91 mesa: move _mesa_half_is_negative() to half_float.h
v2: use !! in the function to be explicit about type conversion.  Though,
gcc generates the same code with or without the logical !!.
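
For readers unfamiliar with the idiom, a minimal sketch of such a helper
follows.  It is not the function from half_float.h; the name is
hypothetical, and the only assumption is the IEEE 754 half-float layout
with the sign in bit 15.

```c
#include <assert.h>
#include <stdint.h>

/* The mask yields 0 or 0x8000; "!!" collapses that to exactly 0 or 1,
 * which makes the integer-to-boolean conversion explicit. */
static inline int
half_is_negative(uint16_t h)
{
   return !!(h & 0x8000);
}
```

Here 0xBC00 (-1.0) and 0x8000 (-0.0) both report negative, while 0x3C00
(1.0) does not.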

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-10 11:28:31 -06:00
Brian Paul
3c5664b78d mesa: move _mesa_exec_malloc/free() prototypes to their own header
Try to start removing things from the cluttered imports.h file.

v2: add new header to Makefile.sources

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-10-10 11:28:31 -06:00
Kenneth Graunke
d670dd6b65 i965: minor whitespace fix 2017-10-10 10:18:17 -07:00
Eric Anholt
45f34d733b mesa: Set new renderbuffers to RGBA4 on all GLES contexts.
Before we were doing RGBA4 on GLES3 only, but as of GLES2 2.0.22 it should
be RGBA4 as well.  Fixes DEQP
functional.state_query.rbo.renderbuffer_internal_format.

Tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-10 09:31:29 -07:00
Eric Anholt
c16a7443e9 mesa: Expose GL_OES_required_internalformat on GLES contexts.
This extension is effectively a backport of GLES3's internalformat
handling to GLES 1/2.  It guarantees that sized internalformats specified
for textures and renderbuffers have at least the specified size stored.
That's a pretty minimal requirement, so I think it can be dummy_true and
exposed as a standard in Mesa.

As a side effect, it also allows GL_RGB565 to be specified as a texture
format, not just as a renderbuffer.  Mesa had previously been allowing 565
textures, which angered DEQP in the absence of this extension being
exposed.

v2: Allow 2101010rev with sized internalformats even on GLES3, citing the
    extension spec.  Extend extension checks for GLES2 contexts exposing
    with texture_float, texture_half_float, and texture_rg.
v3: Fix ALPHA/LUMINANCE/LUMINANCE_ALPHA error checking (GLES3 CTS
    failures)
v4: Mark GL_RGB10 non-color-renderable on ES, fix A/L/LA errors on GLES2
    with float formats.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-10 09:31:29 -07:00
Eric Anholt
cee5585da7 mesa: Only expose GLES's EXT_texture_type_2_10_10_10_REV if supported in HW.
Previously, we were downconverting to 8888 automatically if the hardware
didn't support it.  However, with the advent of
GL_OES_required_internalformat, we have to actually store the
internalformats we advertise support for.  And, it seems rather
disingenuous to advertise the extension if we don't actually support it.

v2: Throw an error when using the format on ES2 without the extension present.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-10 09:31:29 -07:00
Eric Anholt
cbb532429b vc4: Add support for 5551 textures.
This keeps us from promoting them up to 8888, at the cost of not being
color-renderable.
2017-10-10 09:31:29 -07:00
Eric Anholt
ef874ee450 gallium: Add support for 5551 with the 1-bit field in the low bit.
This is how VC4 stores 5551 textures, which we need to support for
GL_OES_required_internalformat.
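
A small C sketch of what "the 1-bit field in the low bit" means for a
16-bit texel.  The function name is hypothetical, and the channel order
(alpha in bit 0, then blue, green, and red in the top five bits) is an
assumption based on the A1B5G5R5 name rather than something spelled out
in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Pack 5-bit R, G, B and 1-bit A with alpha in bit 0.
 * Assumed layout, MSB to LSB: RRRRR GGGGG BBBBB A. */
static uint16_t
pack_a1b5g5r5(unsigned r, unsigned g, unsigned b, unsigned a)
{
   assert(r < 32 && g < 32 && b < 32 && a < 2);

   return (uint16_t)((r << 11) | (g << 6) | (b << 1) | a);
}
```

Under that assumption full red packs to 0xF800 and opaque black to
0x0001, whereas a layout with alpha in the top bit would put opaque
black at 0x8000.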

v2: Extend commit message, fix svga driver build, add BE ordering from
    Roland.
v3: Rebase on PIPE_FORMAT_R10G10B10X2_UNORM addition.

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v2)
2017-10-10 09:31:29 -07:00
Eric Anholt
3078296226 mesa: Add X1B5G5R5 along with A1B5G5R5.
For supporting RGB5 in hardware with A in the low bit (vc4), we need this
format as well.

v2: Add proper _mesa_format_matches_format_and_type() support (from
    Nicolai).

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
2017-10-10 09:31:29 -07:00
Nicolai Hähnle
fbcae1897b st_api: remove unused get_resource_for_egl_image
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-10 13:58:48 +02:00
Nicolai Hähnle
e14fe41e0b st/dri: implement createImageFromRenderbuffer(2)
Tested with dEQP-EGL.functional.image.*renderbuffer* tests.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-10 13:58:48 +02:00
Nicolai Hähnle
4ec2ac11bd egl/dri: remove old left-overs
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-10 13:58:47 +02:00
Nicolai Hähnle
bad24395d9 egl/dri: use createImageFromRenderbuffer2 when available
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-10 13:58:47 +02:00
Nicolai Hähnle
d0d6efcc64 egl/dri: factor out egl_error_from_dri_image_error
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-10 13:58:47 +02:00
Nicolai Hähnle
f12e1c5586 dri_interface: add an error-returning version of createImageFromRenderbuffer
We ought to be able to distinguish between allocation errors and bad
parameters (non-existent renderbuffer object).

Bumps the version of the DRI Image extension to 17.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-10 13:58:46 +02:00
Nicolai Hähnle
9a8f13a33b st/mesa: don't clobber glGetInternalformat* buffer for GL_NUM_SAMPLE_COUNTS
Applications might pass in a buffer that is sized too large and rely
on the extra space of the buffer not being overwritten.
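
The principle can be sketched in C as "write only as many values as the
query produces".  This is an illustration, not the st/mesa change
itself; the function name and signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Copy at most value_count results into dst and never touch the rest of
 * the caller's buffer, even if dst_size is larger.  Returns the number
 * of entries written. */
static int
copy_query_result(int *dst, int dst_size, const int *values, int value_count)
{
   assert(dst != NULL && values != NULL);
   assert(dst_size >= 0 && value_count >= 0);

   int n = value_count < dst_size ? value_count : dst_size;
   for (int i = 0; i < n; i++)
      dst[i] = values[i];

   return n;
}
```

A single-value query such as GL_NUM_SAMPLE_COUNTS would then write one
entry and leave the remaining entries of an oversized buffer untouched.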

Fixes dEQP-GLES31.functional.state_query.internal_format.partial_query.num_sample_counts

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-10 13:58:46 +02:00
Nicolai Hähnle
1b592d30c5 u_threaded_context: fix a memory leak
The uploaders can own transfers which need to be unmapped. Destroy them
before the final sync (they're not used from the driver thread anyway)
so that the transfer_unmap call is processed by the driver.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-10 13:58:46 +02:00
Nicolai Hähnle
76fcede3f4 disk_cache: remove unnecessary NULL-pointer guards
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-10 13:58:45 +02:00
Nicolai Hähnle
b041bf9f4b disk_cache: fix a memory leak
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-10 13:58:45 +02:00
Nicolai Hähnle
83c54a1402 st/mesa: whitespace fix
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-10 13:58:44 +02:00
Nicolai Hähnle
288dea076e st/mesa: fix import of EGL images with non-zero level or layer
In GL state, textures created from EGL images look like plain 2D textures
with a single level, so we use the existing layer_override facility and
add an analogous level_override one.

Fixes dEQP-EGL.functional.image.create.gles2_cubemap_{positive,negative}_{x,y,z}_rgba_texture

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-10 13:58:44 +02:00
Nicolai Hähnle
d245724399 st/mesa: fix switching from surface-based to non-surface-based textures
This can happen with surface-based texture objects derived from EGL
images, since those aren't immutable.

Fixes tests in dEQP-EGL.functional.sharing.gles2.multithread.random.images.teximage2d.* and others

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-10 13:58:44 +02:00
Nicolai Hähnle
a2c8812f91 glsl/linker: add check for compute shared memory size
Unlike uniforms, the limit on shared memory size is not called out
explicitly in the list of things that cause linker errors, but presumably
that's just an oversight in the spec.

Fixes dEQP-GLES31.functional.debug.negative_coverage.{callbacks,get_error,log}.compute.exceed_shared_memory_size_limit

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-10 13:58:43 +02:00
Lucas Stach
ca949e00d8 etnaviv: update HW headers and fix provoking vertex
Now that the real meaning of the 2 bits in PA_SYSTEM_MODE is known,
we can set them according to the rasterizer state, which fixes uses
that are setting provoking vertex first.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-10-10 12:32:24 +02:00
Lucas Stach
0ab59f120b etnaviv: remove flat shading workaround
It turned out not to be a hardware bug, but the shader compiler
emitting wrong varying component use information. With that fixed
we can turn flat shading back on.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10 12:30:34 +02:00
Lucas Stach
cedab87e76 etnaviv: fix varying interpolation
It seems that newer cores don't use the PA_ATTRIBUTES to decide if the
varying should bypass the flat shading, but derive this from the component
use. This fixes flat shading on GC880+.

VARYING_COMPONENT_USE_POINTCOORD is a bit of a misnomer now, as it isn't
only used for pointcoords, but lacking a better name I left it as-is.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10 12:29:35 +02:00
Lucas Stach
03b1f8ba20 etnaviv: fix bogus flush requests in transfer handling
The logic to decide if we need to flush the GPU command stream was broken
and hard to reason about. Fix and clarify this.

Fixes the data sync subtests from piglit arb_vertex_buffer_object.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-10 12:29:03 +02:00
Iago Toral Quiroga
5ec21eb1a0 i965/tes: account for the fact that dvec3/4 inputs take two slots
When computing the total size of the URB for tessellation evaluation
inputs we were not accounting for this, and instead we were always
assuming that each input would take a single vec4 slot, which could
lead to computing a smaller read size than required. Specifically, this
is a problem when the last input is a dvec3/4 such that its XY components
are stored in the second half of a payload register (which can happen
if the offset for the input in the URB is not 64-bit aligned because
there are 32-bit inputs mixed in) and the ZW components in the
first half of the next, as in this case we would fail to account for the
extra slot required for the ZW components.
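
The slot accounting can be illustrated with a short C sketch.  It is not
the i965 code: the struct and function names are hypothetical, and it
only encodes the rule that a dvec3 or dvec4 occupies two vec4 slots
while every other input occupies one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of one shader input. */
struct input_desc {
   unsigned components; /* 1..4 */
   bool is_64bit;       /* double-precision type */
};

/* A vec4 slot holds 128 bits; three or four doubles need two slots. */
static unsigned
input_slots(const struct input_desc *in)
{
   assert(in->components >= 1 && in->components <= 4);

   return (in->is_64bit && in->components > 2) ? 2 : 1;
}

/* Sum the slots of all inputs instead of assuming one slot each. */
static unsigned
total_input_slots(const struct input_desc *inputs, unsigned count)
{
   unsigned slots = 0;

   for (unsigned i = 0; i < count; i++)
      slots += input_slots(&inputs[i]);

   return slots;
}
```

For a vec4, a float, and a trailing dvec3 this gives four slots, one
more than the "one slot per input" assumption would produce.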

Fixes (requires another fix in CTS currently in review):
KHR-GL45.enhanced_layouts.varying_locations
KHR-GL45.enhanced_layouts.varying_array_locations

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-10 08:59:54 +02:00
Tapani Pälli
63e6db18c5 anv: fix null pointer dereference
CID: 1419033

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-10 08:17:44 +03:00
Dave Airlie
4adc456580 radv: export KHR_relaxed_block_layout
This seems to pass all the cts tests it enables.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-10 13:22:44 +10:00
Ilia Mirkin
ce6da2a026 nv50/ir: fix 64-bit integer shifts
TGSI was adjusted to always pass in 64-bit integers but nouveau was left
with the old semantics. Update to the new thing.

Fixes: d10fbe5159 (st/glsl_to_tgsi: fix 64-bit integer bit shifts)
Reported-by: Karol Herbst <karolherbst@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-09 20:42:59 -04:00
Lionel Landwerlin
8ee6828df7 i965: silence coverity warning
Also makes this statement a bit clearer.

CID: 1418920
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
2017-10-10 00:56:01 +01:00
Józef Kucia
91ba331ef4 anv: Do not assert() on VK_ATTACHMENT_UNUSED
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-09 16:28:43 -07:00
Józef Kucia
e0acb630a5 spirv: Fix SpvOpAtomicISub
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
2017-10-09 16:28:11 -07:00
Timothy Arceri
7a7fb90af7 glsl: tidy up IR after loop unrolling
c7affbf687 enabled GLSLOptimizeConservatively on some
drivers. The idea was to speed up compile times by running
the GLSL IR passes only once each time do_common_optimization()
is called. However loop unrolling can create a big mess and
with large loops can actually cause compile times to increase
significantly due to a bunch of redundant if statements being
propagated to other IRs.

Here we make sure to clean things up before moving on.

There was no measurable difference in shader-db compile times,
but it makes compile times of some piglit tests go from a couple
of seconds to basically instant.

The shader-db results seemed positive also:

Totals:
SGPRS: 2829456 -> 2828376 (-0.04 %)
VGPRS: 1720793 -> 1721457 (0.04 %)
Spilled SGPRs: 7707 -> 7707 (0.00 %)
Spilled VGPRs: 33 -> 33 (0.00 %)
Private memory VGPRs: 3140 -> 2060 (-34.39 %)
Scratch size: 3308 -> 2180 (-34.10 %) dwords per thread
Code Size: 79441464 -> 79214616 (-0.29 %) bytes
LDS: 436 -> 436 (0.00 %) blocks
Max Waves: 558670 -> 558571 (-0.02 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-10-10 10:05:37 +11:00
Timothy Arceri
646621c66d glsl: make loop unrolling more like the nir unrolling path
The old code assumed that loop terminators will always be at
the start of the loop, resulting in otherwise unrollable
loops not being unrolled at all. For example the current
code would unroll:

  int j = 0;
  do {
     if (j > 5)
        break;

     ... do stuff ...

     j++;
  } while (j < 4);

But would fail to unroll the following as no iteration limit was
calculated because it failed to find the terminator:

  int j = 0;
  do {
     ... do stuff ...

     j++;
  } while (j < 4);

Also we would fail to unroll the following as we ended up
calculating the iteration limit as 6 rather than 4. The unroll
code then assumed we had 3 terminators rather than 2 as it
wasn't able to determine that "if (j > 5)" was redundant.

  int j = 0;
  do {
     if (j > 5)
        break;

     ... do stuff ...

     if (bool(i))
        break;

     j++;
  } while (j < 4);

This patch changes this pass to be more like the NIR unrolling pass.
With this change we handle loop terminators correctly and also
handle cases where the terminators have instructions in their
branches other than a break.
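
To make the "redundant terminator" point concrete, the first and second
loops above can simply be executed in C.  This is only a sanity check of
the trip count, not a model of the unrolling pass; the function name and
the with_early_break switch are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Run the do-while loop from the message and count how often the body
 * executes.  with_early_break toggles the "if (j > 5) break;" test. */
static unsigned
count_iterations(bool with_early_break)
{
   unsigned iterations = 0;
   int j = 0;

   do {
      if (with_early_break && j > 5)
         break;

      iterations++; /* stands in for "... do stuff ..." */

      j++;
   } while (j < 4);

   return iterations;
}
```

Both variants run the body four times, so the "j > 5" test never fires
and the limit that matters is the one from "j < 4".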

V2:
- fixed regression where loops with a break in else were never
  unrolled in v1.
- fixed confusing/wrong naming of bools in complex unrolling.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-10-10 10:05:37 +11:00
Timothy Arceri
d24e16fe1f glsl: check if induction var incremented before use in terminator
do-while loops can increment the starting value before the
condition is checked. e.g.

  do {
    ndx++;
  } while (ndx < 3);

This commit changes the code to detect this and reduces the
iteration count by 1 if found.
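
One way to see where the off-by-one comes from is to count how often the
condition holds in each arrangement.  The C sketch below is an
illustration only; the function names are hypothetical and the pass's
own bookkeeping is not shown here.

```c
#include <assert.h>

/* Condition tested before the increment, as in a for/while loop. */
static unsigned
passes_check_then_increment(int limit)
{
   unsigned passes = 0;
   int ndx = 0;

   while (ndx < limit) {
      passes++;
      ndx++;
   }
   return passes;
}

/* Increment first, then test, as in "do { ndx++; } while (ndx < limit)". */
static unsigned
passes_increment_then_check(int limit)
{
   unsigned passes = 0;
   int ndx = 0;

   assert(limit >= 1);
   do {
      ndx++;
      if (!(ndx < limit))
         break;
      passes++;
   } while (1);
   return passes;
}
```

With a limit of 3 the condition holds three times in the first form but
only twice in the second, because the terminator already sees the
incremented value.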

V2: fix terminator spelling

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-10-10 10:05:37 +11:00
Timothy Arceri
ab23b759f2 glsl: don't drop instructions from unreachable terminators continue branch
These instructions will be executed on every iteration of the loop,
so we cannot drop them.

V2:
- move removal of unreachable terminators from the terminator list
  to the same place they are removed from the IR as suggested by
  Nicolai.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-10-10 10:05:37 +11:00
Dylan Baker
c63ce5c95d travis: Add a travis profile for meson dri drivers
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-10-09 13:55:12 -07:00
Dylan Baker
68c91264eb travis: don't run ninja test for meson
This pulls in tons of extra dependencies because the tests are not
properly guarded.

v2: - Put this patch before the one that adds a loader/dri test for
      meson

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-10-09 13:54:46 -07:00
Dylan Baker
c2cd5801cd meson: build classic swrast
This adds support for building the classic swrast implementation. This
driver has been tested with glxinfo and glxgears.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-09 13:42:44 -07:00
Dylan Baker
816bf7d164 meson: build gbm
This doesn't include egl support, just dri support.

v2: - when gbm is set to 'auto', only build if a dri driver is also
      enabled
    - Fix conditional to check for x11 modules with vulkan as well as
      with dri drivers
v3: - Set pkgconfig libraries.private value

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-09 13:42:44 -07:00
Dylan Baker
db9788420d meson: Add support for configuring dri drivers directory.
v2: - drop with_ from dri_drivers_path variable (Eric A)
v3: - Move HAVE_X11_PLATFORM to the proper patch (Eric A)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-09 13:42:44 -07:00
Dylan Baker
a47c525f32 meson: build glx
This gets GLX and the loader building. The resulting GLX and i965 have
been tested on piglit and seem to work fine. This patch leaves a lot of
todos in its wake: GLX is quite complicated, the build options
involved are many, and the goal at the moment is to get dri and gallium
drivers building.

v2: - fix typo "vaule" -> "value"
    - put the not on the correct element of the conditional
    - Put correct description of dri3 option in this patch not the next
      one (Eric A)
    - fix non glvnd version (Eric A)
    - build glx tests
    - move loader include variables to this patch (Eric A)
v3: - set the version correctly for GL_LIB_NAME in libglx
v4: - set pkgconfig private fields

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-09 13:42:44 -07:00
Dylan Baker
3218056e0e meson: Build i965 and dri stack
This gets pretty much the entire classic tree building, as well as
i965, including the various glapis. There are some workarounds for bugs
that are fixed in meson 0.43.0, which is due out on October 8th.

I have tested this with piglit using glx.

v2: - fix typo "vaule" -> "value"
    - use gtest dep instead of linking to libgtest (rebase error)
    - copy the megadriver, then create hard links from that, then delete
      the megadriver. This matches the behavior of the autotools build.
      (Eric A)
    - Use host_machine instead of target_machine (Eric A)
    - Put a comment in the right place (Eric A)
    - Don't have two variables for the same information (Eric A)
    - Put pre_args at top of file in this patch (Eric A)
    - Fix glx generators in this patch instead of next (Eric A)
    - Remove -DMESON hack (Eric A)
    - add sha1_h to mesa in this patch (Eric A)
    - Put generators in loops when possible to reduce code in
      mapi/glapi/gen (Eric A)
v3: - put HAVE_X11_PLATFORM in this patch

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-09 13:42:44 -07:00
Dylan Baker
86eb09a136 meson: de-tabularize meson_options.txt
This ends up being unworkable as more options get added, and with
description wrapped onto a new line it doesn't improve readability
anyway.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-09 13:42:43 -07:00
Dylan Baker
97aea7d507 meson: only require libelf if building radv
And add a todo about clover, r600, and radeonsi, which also need libelf.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-09 13:42:43 -07:00
Dylan Baker
001b65a899 meson: add nir_linking_helpers.c to libnir
This was missed in a rebase, and doesn't affect radv or anv, only i965.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-09 13:42:43 -07:00
Dylan Baker
fc48ad2427 make: Fix test to be meson compatible
This has the same problem as the previous commit, generated headers and
hardcoded paths.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-09 13:42:43 -07:00
Dylan Baker
1b1bb6ee10 make: Don't traverse backwards through include directories.
Traversing back through includes is a bad idea and should be avoided.
In the case here - indirect_size.h is located in the build directory
$(top_builddir)/src/glx/.

v3: - Update commit message with message provided by Emil

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-09 13:42:43 -07:00
Dylan Baker
e5866af123 editorconfig: Add meson configuration
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-09 13:40:50 -07:00
Christian Gmeiner
148604fe75 etnaviv: call util_query_clear_result(..) in the generic layer
Saves us from calling util_query_clear_result(..) in every query
type implementation.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-09 22:19:47 +02:00
Christian Gmeiner
b22bacc6cf etnaviv: push query active handling into generic layer
We want the same active handling for every query type. So let's
handle it in the generic layer.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-10-09 22:19:31 +02:00
Dave Airlie
bee61d16c8 r600: drop a bunch of post-cayman code. (v2)
Now that Marek has split the two drivers apart, drop a bunch
of unnecessary code from the r600 half. There is probably a bunch
more hiding in the video code.

No piglit regressions on caicos.

v2: fix HAVE_LLVM protected code
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-10 06:08:42 +10:00
Marek Olšák
7b697c8b78 amd: move r600d_common.h into r600g
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:27:06 +02:00
Marek Olšák
76997e9133 radeonsi: shrink r600d_common.h and stop using it
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:27:05 +02:00
Marek Olšák
0ecf9b90ef radeonsi: import cayman_msaa.c from drivers/radeon
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:27:04 +02:00
Marek Olšák
345f04ed92 radeonsi: remove r600_emit_reloc
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:27:02 +02:00
Marek Olšák
da61946cb1 radeonsi: merge si_set_streamout_targets with si_common_set_streamout_targets
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:27:00 +02:00
Marek Olšák
a86c9328ce radeonsi: add si_so_target_reference
The src type is different on purpose.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:26:58 +02:00
Marek Olšák
65f2e33500 radeonsi: import r600_streamout from drivers/radeon
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:26:55 +02:00
Marek Olšák
ed7f27ded8 radeonsi: add performance thresholds for CP DMA, decrease it for clears
The first one isn't used yet.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:24:21 +02:00
Marek Olšák
8e969cce38 radeonsi: disable primitive binning on Vega10 (v2)
Our driver implementation is known to decrease performance for some tests,
but we don't know if any apps and benchmarks (e.g. those tested by Phoronix)
are affected. This disables the feature just to be safe.

Set this to enable partial primitive binning:
    R600_DEBUG=dpbb
Set this to enable full primitive binning:
    R600_DEBUG=dpbb,dfsm

v2: add new debug options

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:20:18 +02:00
Marek Olšák
3784ce9782 radeonsi: enumerize DBG flags
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-09 16:20:16 +02:00
Marek Olšák
99fa9ccf96 drirc: whitelist glthread for Spec Ops: The Line
On i7 4790k and a 280X, there is a boost of about 10% more FPS.

Nominated by John Ettedgui.
2017-10-09 15:43:33 +02:00
Samuel Pitoiset
7824cb4b03 radv: configure VGT_VERTEX_REUSE at pipeline creation
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-09 10:06:19 +02:00
Samuel Pitoiset
b09b43b166 radv: do not need to zero-init ds/raster states
Already done when creating the pipeline.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-09 10:06:17 +02:00
Samuel Pitoiset
d4652e7c86 radv: remove unused fields in radv_raster_state
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-09 10:06:15 +02:00
Samuel Pitoiset
6732a8369a radv: set ALPHA_TO_MASK_ENABLE at blend state init
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-09 10:05:06 +02:00
Samuel Pitoiset
5848565ee3 radv: emit PA_SU_POINT_{SIZE,MINMAX} in si_emit_config()
These registers don't change during the lifetime of the
command buffer, so there is no need to re-emit them when
binding a new pipeline.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-09 10:05:04 +02:00
Samuel Pitoiset
aab1537568 radv: allow launching waves out-of-order for compute
Ported from RadeonSI, and -pro seems to enable it as well.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-09 10:04:17 +02:00
Jason Ekstrand
6c7720ed78 anv/wsi: Allocate enough memory for the entire image
Previously, we allocated memory for image->plane[0].surface.isl.size
which is great if there is no compression.  However, on BDW, we can do
CCS_D on X-tiled images so we also have to allocate space for the
auxiliary buffer.  This fixes hangs in some of the WSI CTS tests and
should also reduce hangs in real applications.  In particular, it fixes
the dEQP-VK.wsi.*.incremental_present.* test group.

When we hand the image off to X11 or Wayland, it will ignore the CCS
entirely which is ok because we do a resolve when it's transitioned to
VK_IMAGE_LAYOUT_PRESENT_SRC_KHR.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-07 17:12:38 -07:00
Lionel Landwerlin
e262845e37 anv: fix nir.h include
All over mesa we include "nir/nir.h", we should probably do the same
here. This fixes the meson build that was broken by the ycbcr series.

Thanks to Dylan for finding the issue.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: f3e91e78a3 ("anv: add nir lowering pass for ycbcr textures")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-07 22:57:50 +01:00
Jason Ekstrand
49a6fb8474 spirv: Don't warn on the ImageCubeArray capability
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-07 14:52:03 -07:00
Kenneth Graunke
37e128b9b7 mesa: make glFramebuffer* check immutable texture level bounds
When a texture is immutable, we can't tack on extra levels
after-the-fact like we could with glTexImage. So check against that
level limit and return an error if it's surpassed.

This fixes:
KHR-GL45.geometry_shader.layered_fbo.fb_texture_invalid_level_number

(Based on a patch by Ilia Mirkin.)

Reviewed-by: Antia Puentes <apuentes@igalia.com> [imirkin v2]
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 13:26:55 -07:00
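A rough sketch of the bounds check described in 37e128b9b7 above. It is an illustration only, not Mesa's code: `struct tex_sketch`, its fields, and `framebuffer_level_is_valid` are hypothetical names, and it assumes an immutable texture records how many levels it was created with.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a texture object; not a Mesa structure. */
struct tex_sketch {
   bool immutable;        /* storage was fixed at creation time */
   int immutable_levels;  /* number of levels allocated at creation */
   int max_levels;        /* limit that applies to mutable textures */
};

/* Return true if `level` may be attached to a framebuffer. */
static bool
framebuffer_level_is_valid(const struct tex_sketch *tex, int level)
{
   assert(tex);

   if (level < 0)
      return false;

   /* An immutable texture cannot gain levels later, so the bound is
    * the number of levels it was created with. */
   int limit = tex->immutable ? tex->immutable_levels : tex->max_levels;
   return level < limit;
}
```

A caller in the glFramebuffer* path would report a GL error when this returns false.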
Marek Olšák
5a47abb63e radeonsi: don't change viewport for blits, use window-space positions
The viewport state was an identity anyway.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
76ef08f6ee radeonsi: set correct PA_SC_VPORT_ZMIN/ZMAX when viewport is disabled
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
13b6c1c031 radeonsi: minor cleanup of si_update_vs_writes_viewport_index
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
5f566faa46 radeonsi: don't save and restore vertex buffers and elements for u_blitter
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
69ccb9dae7 radeonsi: use new VS blit shaders (VS inputs in SGPRs)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
6a8401a94e radeonsi: add VS blit shader creation
no users yet

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
f3fe6afba8 radeonsi: split declare_default_desc_pointers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
0a3b5a0232 gallium/u_blitter: let drivers decide which VS to use for draw_rectangle
This approach allows drivers to set their own vertex shader and skip
compilation of u_blitter vertex shaders.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
a46bcf0a77 gallium/u_blitter: let drivers set the vertex elements state
radeonsi won't set it.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
7f8af4624d gallium/u_blitter: remove blitter_context_priv::viewport
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
f84a63bc00 radeonsi: don't use util_draw_arrays_instanced in si_draw_rectangle
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
387590accb radeonsi: move si_draw_rectangle into si_state_draw.c
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
de810f8b84 radeonsi: remove wrappers si_decompress_xx_textures
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
efd72b31cb gallium/radeon: remove r600_atom::num_dw
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Marek Olšák
f1eb9a9c27 gallium/radeon: remove old r600g code checking chip_class and family
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-07 18:26:35 +02:00
Mark Thompson
c4ed39f85b st/va: Implement vaExportSurfaceHandle()
This is a new interface in libva2 to support wider use-cases of passing
surfaces to external APIs.  In particular, this allows export of NV12 and
P010 surfaces.

v2: Convert surfaces to progressive before exporting them (Christian).

v3: Set destination rectangle to match source when converting (Leo).
    Add guards to allow building with libva1.

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-and-Tested-by: Leo Liu <leo.liu@amd.com>
2017-10-07 10:15:14 -04:00
Roland Scheidegger
52b73caaf4 gallivm: don't use pabs intrinsic with llvm version >= 6
The intrinsic is gone, causing shader compilation to crash.
While here, also change the fallback code to match what llvm's auto-updater
of these intrinsics would do (except that there will still be zext/trunc
instructions in there), which should ensure that the sequence gets recognized
and fused back into a pabs in the end (I didn't test this, and it's possible
even the old sequence would get recognized, but I don't see a reason why we
shouldn't use the same sequence in any case).

Tested-by: Vinson Lee <vlee@freedesktop.org>
2017-10-07 00:54:09 +02:00
Tim Rowley
9716c69e22 swr/rast: use proper alignment for debug transposedPrims
The misalignment caused a crash in the ParaView waveletcontour.py test
when _DEBUG is defined, due to a vector aligned copy with an unaligned
address.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-10-06 13:27:39 -05:00
Lionel Landwerlin
0763f814d7 anv/cmd_buffer: Reset state in cmd_buffer_destroy
This ensures that everything gets cleaned up properly. In particular,
it fixes a memory leak where we were leaking the push constants
structs.

Valgrind stats on
dEQP-VK.pipeline.push_constant.graphics_pipeline.range_size_128 :

Before:
HEAP SUMMARY:
    in use at exit: 2,467,513 bytes in 1,305 blocks
  total heap usage: 697,853 allocs, 696,530 frees, 138,466,600 bytes allocated

LEAK SUMMARY:
   definitely lost: 1,068 bytes in 11 blocks
   indirectly lost: 24,669 bytes in 412 blocks
     possibly lost: 0 bytes in 0 blocks
   still reachable: 2,441,776 bytes in 882 blocks
        suppressed: 0 bytes in 0 blocks

After:
HEAP SUMMARY:
    in use at exit: 2,467,381 bytes in 1,304 blocks
  total heap usage: 697,853 allocs, 696,531 frees, 138,466,600 bytes allocated

LEAK SUMMARY:
   definitely lost: 936 bytes in 10 blocks
   indirectly lost: 24,669 bytes in 412 blocks
     possibly lost: 0 bytes in 0 blocks
   still reachable: 2,441,776 bytes in 882 blocks
        suppressed: 0 bytes in 0 blocks

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.2 17.1" <mesa-stable@lists.freedesktop.org>
2017-10-06 17:32:34 +01:00
Lionel Landwerlin
d296dea54e anv/cmd_buffer: fix push descriptors with set > 0
When writing to set > 0, we were just wrongly writing to set 0. This
commit fixes this by lazily allocating each set as we write to them.

We didn't go for having them directly in the command buffer as this
would require an additional ~45Kb per command buffer.

v2: Allocate push descriptors from system memory rather than in BO
    streams. (Lionel)

Cc: "17.2 17.1" <mesa-stable@lists.freedesktop.org>
Fixes: 9f60ed98e5 ("anv: add VK_KHR_push_descriptor support")
Reported-by: Daniel Ribeiro Maciel <daniel.maciel@gmail.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 17:32:13 +01:00
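The lazy per-set allocation described in d296dea54e above might look roughly like this. It is a sketch under stated assumptions: every name here (`cmd_buffer_sketch`, `push_set_sketch`, `get_push_set`, the two size constants) is hypothetical and does not come from anv.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_MAX_SETS 8   /* hypothetical limit on descriptor sets */
#define SKETCH_SET_SIZE 64  /* hypothetical per-set payload size */

struct push_set_sketch {
   unsigned char data[SKETCH_SET_SIZE];
};

struct cmd_buffer_sketch {
   /* One slot per set; NULL until that set is first written. */
   struct push_set_sketch *push_sets[SKETCH_MAX_SETS];
};

/* Return storage for `set`, allocating it from system memory on first
 * use. Returns NULL if the allocation fails. */
static struct push_set_sketch *
get_push_set(struct cmd_buffer_sketch *cmd, unsigned set)
{
   assert(set < SKETCH_MAX_SETS);

   if (cmd->push_sets[set] == NULL)
      cmd->push_sets[set] = calloc(1, sizeof(*cmd->push_sets[set]));
   return cmd->push_sets[set];
}

/* Release every set that was allocated. */
static void
free_push_sets(struct cmd_buffer_sketch *cmd)
{
   for (unsigned i = 0; i < SKETCH_MAX_SETS; i++) {
      free(cmd->push_sets[i]);
      cmd->push_sets[i] = NULL;
   }
}
```

Writing to set 2 allocates only slot 2, so a command buffer that never uses push descriptors pays only for the pointer array.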
Lionel Landwerlin
b24b93d584 anv: enable VK_KHR_sampler_ycbcr_conversion
v2: Make GetImageMemoryRequirements2KHR() iterate over all pInfo
    structs (Lionel)
    Handle VkSamplerYcbcrConversionImageFormatPropertiesKHR (Andrew/Jason)
    Iterate over BindImageMemory2KHR's pNext structs correctly (Jason)

v3: Revert GetImageMemoryRequirements2KHR() change from v2 (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 16:34:04 +01:00
Lionel Landwerlin
a62a979335 anv: enable multiple planes per image/imageView
This change introduces the concept of planes for images & views. It
matches the planes available in new formats.

We also refactor depth & stencil support through the usage of planes
for the sake of uniformity. In the backend (genX_cmd_buffer.c) we have
to take some care though with regard to auxiliary surfaces.
Multiplanar color buffers can have multiple auxiliary surfaces but
depth & stencil share the same HiZ one (stored only in the depth
plane).

v2: by Jason
    Remove unused aspect parameters from anv_blorp.c
    Assert when attempting to resolve YUV images
    Drop redundant logic for plane offset in make_surface()
    Rework anv_foreach_plane_aspect_bit()

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 16:32:20 +01:00
Jason Ekstrand
185e719090 anv: Take an image in can_sample_with_hiz
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-06 16:32:19 +01:00
Jason Ekstrand
558d8a3979 anv: Take a single aspect in anv_layout_to_aux_usage
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-06 16:32:19 +01:00
Jason Ekstrand
3735af0415 anv/cmd_buffer: Make get_fast_clear_state return an address
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-06 16:32:19 +01:00
Jason Ekstrand
fd146e4f3f anv/blorp: Add a concept of default aux usage
A good chunk of anv_blorp just wants the aux usage from the image.  This
magic aux_usage value means just that.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-06 16:32:19 +01:00
Lionel Landwerlin
f3e91e78a3 anv: add nir lowering pass for ycbcr textures
This pass implements all the implicit conversions required by the
VK_KHR_sampler_ycbcr_conversion specification.

It also inserts plane sources onto sampling instructions that we then
let the pipeline layout pass deal with, when mapping things correctly
to descriptors.

v2: Add new file to meson build (Lionel)
    Use nir_frcp() rather than (1.0f / x) (Jason)
    Reuse nir_tex_instr_dest_size() rather than handwritten one (Jason)
    Return progress (Jason)
    Account for array of samplers (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 16:32:19 +01:00
Lionel Landwerlin
3492d56067 anv: prepare sampler emission code for multiplanar images
New settings from the KHR_sampler_ycbcr_conversion specifications
might require different sampler settings for luma and chroma planes.
This change makes the sampler table emission ready to handle multiple
planes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 16:32:19 +01:00
Lionel Landwerlin
a2a7846d37 anv/apply_pipeline_layout: Prepare for multi-planar images
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 16:32:19 +01:00
Lionel Landwerlin
72aec2060f anv: add new formats KHR_sampler_ycbcr_conversion
Add new downsampling factors for each plane.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 11:46:08 +01:00
Lionel Landwerlin
bbc3700798 anv: modify the internal concept of format to express multiple planes
A given Vulkan format can now be decomposed into a set of planes. We
now use 'struct anv_format_plane' to represent the format of those
planes.

v2: by Jason
    Rename anv_get_plane_format() to anv_get_format_plane()
    Don't rename anv_get_isl_format()
    Replace ds_fmt() by fmt2()
    Introduce fmt_unsupported()

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 11:46:03 +01:00
Lionel Landwerlin
18914715d1 anv: prepare formats to handle disjoint sets
Newer format enums start at offset 1000000000, making it impossible to
have them all in one table. This change splits the formats into sets
that we then access through indirection.

v2: rename format_extract to vk_to_anv_format (Chad/Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 11:45:56 +01:00
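To illustrate the split described in 18914715d1 above: Vulkan extension enum values follow the layout 1000000000 + (extension_number - 1) * 1000 + offset, so a value can be decomposed into an extension number and an offset, and the offset then indexes a per-extension table. The sketch below assumes that layout; the table contents, struct names, and function names are hypothetical and are not anv's. Extension number 157 is VK_KHR_sampler_ycbcr_conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_EXT_BASE 1000000000u

/* Extension number encoded in an enum value, or 0 for core values. */
static uint32_t
enum_extension(uint32_t e)
{
   return e >= SKETCH_EXT_BASE ? (e - SKETCH_EXT_BASE) / 1000u + 1 : 0;
}

/* Offset of an enum value inside its extension (or core) range. */
static uint32_t
enum_offset(uint32_t e)
{
   return e >= SKETCH_EXT_BASE ? e % 1000u : e;
}

struct format_sketch { int supported; };  /* hypothetical payload */

struct format_set_sketch {
   uint32_t extension;                    /* 0 for core formats */
   const struct format_sketch *formats;
   uint32_t count;
};

/* Made-up table contents, for illustration only. */
static const struct format_sketch core_formats[] = { {0}, {1}, {1} };
static const struct format_sketch ycbcr_formats[] = { {1}, {1} };

static const struct format_set_sketch format_sets[] = {
   { 0,   core_formats,  3 },
   { 157, ycbcr_formats, 2 },
};

/* Find the set for the enum's extension, then index it by offset. */
static const struct format_sketch *
lookup_format(uint32_t e)
{
   uint32_t ext = enum_extension(e);
   uint32_t off = enum_offset(e);
   size_t n = sizeof(format_sets) / sizeof(format_sets[0]);

   for (size_t i = 0; i < n; i++) {
      const struct format_set_sketch *set = &format_sets[i];
      if (set->extension != ext)
         continue;
      assert(set->formats != NULL);
      return off < set->count ? &set->formats[off] : NULL;
   }
   return NULL;
}
```

Each set stays dense even though the enum values themselves are about a billion apart, which is the point of the indirection.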
Lionel Landwerlin
42a8fd1670 isl: fill out layout descriptions for yuv formats
Some descriptions were missing.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 11:45:52 +01:00
Lionel Landwerlin
f86c1b1595 isl: check whether a format is rgb if colorspace is yuv
Suggested by Chad.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 11:45:49 +01:00
Lionel Landwerlin
5e9f52ff4d isl: make format layout channels accessible by index
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 11:45:44 +01:00
Lionel Landwerlin
c90e50f3a0 vulkan: util: add macros to extract extension/offset number from enums
v2: Simplify offset enum computation (Jason)

v3: capitalize macros (Chad)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 11:45:41 +01:00
Samuel Pitoiset
c8ea55ddda radv: convert all COMPUTE operations to the RADV_META_SAVE_XXX flags
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-06 09:49:06 +02:00
Samuel Pitoiset
213f86e514 radv: add RADV_META_SAVE_COMPUTE_PIPELINE flag
This will allow us to merge the compute save/restore helpers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-06 09:49:06 +02:00
Samuel Pitoiset
ba3dc3519d radv: add radv_meta_save() helper
And merge radv_meta_save_novertex() with
radv_meta_save_graphics_reset_vport_scissor_novertex().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-06 09:49:06 +02:00
Samuel Pitoiset
8d91f4e45f radv: merge radv_meta_{save,restore}_pass() with RADV_META_SAVE_PASS
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-06 09:49:06 +02:00
Samuel Pitoiset
55ee532932 radv: convert all GFX operations to the RADV_META_SAVE_XXX flags
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-06 09:49:05 +02:00
Samuel Pitoiset
807f2d4f33 radv: introduce the concept of meta save flags
This will allow us to save/restore the different states on-demand
based on the meta operation. For now, this saves/restores all
states. Compute will follow once the graphics part is done.

The main idea is to merge all save/restore helpers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-06 09:49:05 +02:00
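The save-flags idea from 807f2d4f33 above can be sketched as a bitmask that selects which pieces of state a meta operation saves and later restores. This is only an illustration: the `SKETCH_` names, the three state fields, and both helpers are hypothetical and do not mirror radv's actual structures.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flags, one bit per piece of saved state. */
enum sketch_meta_save_flags {
   SKETCH_META_SAVE_PASS              = 1u << 0,
   SKETCH_META_SAVE_GRAPHICS_PIPELINE = 1u << 1,
   SKETCH_META_SAVE_COMPUTE_PIPELINE  = 1u << 2,
};

/* Hypothetical command buffer state, reduced to three integers. */
struct sketch_state {
   int pass;
   int gfx_pipeline;
   int compute_pipeline;
};

struct sketch_saved_state {
   uint32_t flags;             /* which fields below are valid */
   struct sketch_state state;
};

/* Save only the state selected by `flags`. */
static void
meta_save(struct sketch_saved_state *saved, const struct sketch_state *cur,
          uint32_t flags)
{
   assert(saved && cur);

   saved->flags = flags;
   saved->state = (struct sketch_state){0};
   if (flags & SKETCH_META_SAVE_PASS)
      saved->state.pass = cur->pass;
   if (flags & SKETCH_META_SAVE_GRAPHICS_PIPELINE)
      saved->state.gfx_pipeline = cur->gfx_pipeline;
   if (flags & SKETCH_META_SAVE_COMPUTE_PIPELINE)
      saved->state.compute_pipeline = cur->compute_pipeline;
}

/* Restore exactly what meta_save() recorded. */
static void
meta_restore(const struct sketch_saved_state *saved, struct sketch_state *cur)
{
   assert(saved && cur);

   if (saved->flags & SKETCH_META_SAVE_PASS)
      cur->pass = saved->state.pass;
   if (saved->flags & SKETCH_META_SAVE_GRAPHICS_PIPELINE)
      cur->gfx_pipeline = saved->state.gfx_pipeline;
   if (saved->flags & SKETCH_META_SAVE_COMPUTE_PIPELINE)
      cur->compute_pipeline = saved->state.compute_pipeline;
}
```

With one save/restore pair driven by flags, each meta operation names the state it touches instead of calling a dedicated helper.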
Samuel Pitoiset
a3a497c921 radv: remove unused RADV_META_VERTEX_BINDING_COUNT
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-06 09:49:05 +02:00
Samuel Pitoiset
b269ed3d94 radv: select the pipeline outside of the loop when decompressing htile
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-06 09:49:05 +02:00
Samuel Pitoiset
507df35939 radv: add radv_htile_enabled() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-06 09:49:05 +02:00
Tapani Pälli
0351638284 i965: pass wanted format to intel_miptree_create_for_dri_image
Change b3a44ae7a4 caused regressions on Android where DRI and renderbuffer
can disagree on the format being used. This patch removes the colorspace
parameter and instead passes the renderbuffer format. For non-winsys images we
still do the srgb/linear modification in the same manner as change b3a44ae7a4
wanted, but take the format from the renderbuffer instead of the DRI image.

This patch fixes regressions seen with the following test sets:

   dEQP-EGL.functional.color_clears*
   dEQP-EGL.functional.render*

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102999
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-06 08:06:13 +03:00
Marek Olšák
c4d1a199f8 radeonsi: add a drirc workaround for HTILE corruption in ARK: Survival Evolved
v2: use DB_META | PS_PARTIAL_FLUSH

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102955
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
2017-10-06 02:56:11 +02:00
Marek Olšák
15d918e46f radeonsi: inline struct si_sampler_views
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
23cdde5138 radeonsi: rename si_textures_info -> si_samplers, si_images_info -> si_images
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
3dfb375446 radeonsi: fold needs_*_decompress_mask update into si_set_sampler_view
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
bd5509d0a8 radeonsi: simplify a loop in si_update_fb_dirtiness_after_rendering
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
bcd3e761a3 ac: properly document a buffer.store LLVM workaround
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
cceb916456 radeonsi: use f32_0 and f32_1
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
1516059ab1 radeonsi: fold *gallivm
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
e1b83c67da radeonsi: lp_type::length is always 1
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
906ee3a3ba radeonsi: don't use bld.elem_type
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
723a23905f radeonsi: don't use lp_build_const_*
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
b4600b4740 radeonsi: use ctx->ac.context and ctx->types
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
d0751f6c1f radeonsi: use ctx->ac.builder
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
82dc72c8bd radeonsi: use ctx->i/f32 types more
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
dcbd3d470c radeonsi: use i32_0 and i32_1 more
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
bacdf5a928 radeonsi: use bitcast in a few places
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
ad7305aa96 radeonsi: use ac helpers for bitcasts
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
9a88580a4b glsl_to_tgsi: skip UARL for 1D registers if the driver doesn't need it
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
1bf1bfc12a glsl_to_tgsi: handle reladdr as TEMP in rename_temp_registers and dead_code
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
8602c6a326 glsl_to_tgsi: each reladdr object should have only one parent
required by rename_temp_registers.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
985338e2cb glsl_to_tgsi: fix instruction order for bindless textures
We emitted instructions loading the bindless handle after the memory
instruction.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
bd1837471a glsl_to_tgsi: enable copy propagation for tessellation shaders
just don't propagate output reads

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
dbe16d7537 radeonsi: implement PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
44993bd26f radeonsi: use si_get_indirect_index for TEMP indexing
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
e986a16c16 radeonsi: use si_get_indirect_index for CONST indexing
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
212c612a63 tgsi/ureg: allow any register file in address operands
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
41b85158ab gallium: add PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
cb686a340f tgsi/scan: scan address operands (v2)
v2: set swizzled usage mask

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
37714c6df2 tgsi/scan: set correct usage mask for tex offsets in scan_src_operand
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
5cc779197c tgsi/scan: take advantage of already swizzled usage mask in scan_src_operand
It has always been a usage mask *after* swizzling.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
ea85b76519 tgsi/scan: set non-valid src_index for tex offsets in scan_src_operand
tex offsets are not "Src" operands.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
be3ab867bd tgsi: implement tgsi_util_get_inst_usage_mask properly
All opcodes are handled.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Marek Olšák
bb8abc10bf tgsi: add docs for some existing pack opcodes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-06 02:56:11 +02:00
Bas Nieuwenhuizen
4ffb9890ef radv: Enable VK_KHR_maintenance2 extension.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-06 01:41:29 +02:00
Bas Nieuwenhuizen
0c90ca7d37 radv: Make tess winding order a bit more intuitive.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-06 01:41:29 +02:00
Bas Nieuwenhuizen
c62afd094d radv: Allow setting the domain origin in tess.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-06 01:41:29 +02:00
Bas Nieuwenhuizen
ca21634632 radv: Disable usage checks in metadata for images with extended usage data.
The app can extend the usage, so knowing that the usage is limited
does not help us here.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-06 01:41:29 +02:00
Bas Nieuwenhuizen
f800d91019 radv: Implement querying the point clipping behavior.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-06 01:41:29 +02:00
Daniel Stone
bbe2082e7d broadcom: Fix out-of-tree build include path
Reviewed-by: Eric Anholt <eric@anholt.net>
Fixes: 5b102160ae ("broadcom/genxml: Introduce a V3D packet/struct decoder.")
2017-10-05 15:03:11 -07:00
Bas Nieuwenhuizen
908a25ecb0 meson: generate builddir/src/amd/vulkan/dev_icd.json
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-05 23:46:21 +02:00
Kenneth Graunke
18bdf73556 mesa: Use a 565 format for GL_RGB and GL_UNSIGNED_SHORT_5_6_5 textures.
Found while trying to optimize an application.

Not observed to help performance on i965, but should at least reduce
the memory usage of such textures a bit.

Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
2017-10-05 14:30:47 -07:00
Jason Ekstrand
7463d50580 intel/compiler: Don't propagate cmod into integer multiplies
No shader-db change on Sky Lake.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-05 11:54:49 -07:00
Jason Ekstrand
b91ecee04a intel/compiler: Don't cmod propagate into a saturated operation
Shader-db results on Sky Lake:

    total instructions in shared programs: 12954445 -> 12955125 (0.01%)
    instructions in affected programs: 141862 -> 142542 (0.48%)
    helped: 0
    HURT: 626

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-05 11:54:49 -07:00
Derek Foreman
17d78ece36 broadcom/vc4: Don't advertise tiled dmabuf modifiers if we can't use them
If the DRM_VC4_GET_TILING ioctl isn't present then we can't tell
if a dmabuf bo is tiled or linear, so will always assume it's
linear.

By not advertising tiled formats in this situation we ensure the
assumption is correct.

This fixes a bug where most attempts to render a gl wayland client
under weston will result in a client side abort.

Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Daniel Stone <daniels@collabora.com> (on irc)
2017-10-05 11:26:14 -07:00
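The rule from 17d78ece36 above, sketched in isolation: the tiled modifier is advertised only when the kernel can report a buffer's tiling. The modifier values below are placeholders for illustration (real code would take them from drm_fourcc.h), and `query_modifiers` and its `has_get_tiling` parameter are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder modifier values; not the real DRM format modifier codes. */
#define SKETCH_MOD_LINEAR  UINT64_C(0)
#define SKETCH_MOD_T_TILED UINT64_C(1)

/* Fill `out` (capacity `max`) with the modifiers that may be advertised
 * and return how many were written. */
static int
query_modifiers(bool has_get_tiling, uint64_t *out, int max)
{
   int n = 0;

   assert(out && max >= 0);

   /* A tiled dmabuf is only safe to import if its tiling can be
    * queried; otherwise every import is assumed to be linear. */
   if (has_get_tiling && n < max)
      out[n++] = SKETCH_MOD_T_TILED;
   if (n < max)
      out[n++] = SKETCH_MOD_LINEAR;
   return n;
}
```

Without the tiling query, only the linear modifier is returned, which keeps the "always linear" assumption true.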
Adam Jackson
b174a1ae72 egl: Simplify the "driver" interface
"Driver" isn't a great word for what this layer is, it's effectively a
build-time choice about what OS you're targeting. Even though both of
the extant backends totally ignore the display argument, the old code
would only set up the backend relative to a display.

That causes problems! One problem is it means eglGetProcAddress can
generate X or Wayland protocol when it tries to connect to a default
display so it can call into the backend, which is, you know, completely
bonkers. Any other EGL API that doesn't reference a display, like
EGL_EXT_device_query, would have the same issue.

Fortunately this is a problem that can be solved with the delete key.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2017-10-05 13:43:34 -04:00
Thomas Hellstrom
15e208c4cc loader/dri3: Don't accidentally free buffer holding new back content
Avoid freeing buffers holding new back content
(with GLX_SWAP_COPY_OML and GLX_SWAP_EXCHANGE_OML).
Previously that would have resulted in back buffer content becoming
incorrect after a swap, although I haven't managed to trigger such a
situation yet.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2017-10-05 09:17:12 +02:00
Thomas Hellstrom
1b8e0bed69 loader/dri3: Avoid resizing existing buffers in dri3_find_back_alloc
Resize only in loader_dri3_get_buffers(),
where the dri driver has a chance to immediately update the viewport.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2017-10-05 09:17:12 +02:00
Thomas Hellstrom
622f5e1d9b loader/dri3: Use local blits and local buffers when resizing
When a drawable is resized and we fill the resized buffers with data
from the old buffers, use a local blit if there is a local buffer (back or
fake front) and we have local blitting capability.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2017-10-05 09:17:12 +02:00
Ben Crocker
1359af930e gallivm/ppc64le: allow environmental control of Altivec code generation
In check_os_altivec_support(), allow control of Altivec (first PPC vector
instruction set) code generation via a new environmental control,
GALLIVM_ALTIVEC, which is expected to take on a value of 1 or 0.
The default is to enable Altivec code generation.

This environmental control of Altivec code generation is initially
available only #ifdef DEBUG.
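
A minimal sketch of a 1/0 environment control that defaults to enabled.
The helper names parse_flag() and env_flag_enabled() are hypothetical and
this is not the code in check_os_altivec_support(); only the variable name
GALLIVM_ALTIVEC and the default come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: an unset or empty value keeps the default, a value
 * starting with '0' disables, anything else enables. */
static bool
parse_flag(const char *value, bool default_value)
{
   if (value == NULL || value[0] == '\0')
      return default_value;

   return value[0] != '0';
}

/* Hypothetical helper: look the control up in the environment. */
static bool
env_flag_enabled(const char *name, bool default_value)
{
   assert(name != NULL);
   return parse_flag(getenv(name), default_value);
}
```

With this sketch, env_flag_enabled("GALLIVM_ALTIVEC", true) stays true
unless the variable is set to a value starting with 0.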

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2017-10-05 02:14:14 +02:00
Ben Crocker
e93f056a4e gallivm/ppc64le: adjust VSX code generation control.
In lp_build_create_jit_compiler_for_module(), advance the minimum
version of LLVM for VSX code generation to 4.0; this is the minimum
revision at which several known VSX code generation bugs are fixed:

  https://llvm.org/bugs/show_bug.cgi?id=25503 (fixed in 3.8.1)
  https://llvm.org/bugs/show_bug.cgi?id=26775 (fixed in 3.8.1)
  https://llvm.org/bugs/show_bug.cgi?id=33531 (fixed in 4.0)

An llc performance bug introduced in LLVM 4.0,

  https://llvm.org/bugs/show_bug.cgi?id=34647

is still pending as of LLVM 5.0, but only has a pronounced effect on
one of the Piglit tests: ext_transform_feedback-max-varyings.

All changes tested via Piglit.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2017-10-05 02:13:47 +02:00
Ben Crocker
5c75f0c8bb gallivm: allow additional llc options
In init_native_targets, allow the passing of additional options to
the LLC compiler via new GALLIVM_LLC_OPTIONS environmental control.
This option is available only #ifdef DEBUG, initially.
At top, add #include <llvm-c/Support.h> for LLVMParseCommandLineOptions()
declaration.

v2: Fix compile error with old llvm versions (sroland)

Cc: "17.2" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-10-05 02:06:46 +02:00
Ben Crocker
3a9feb4db8 gallivm: fix typo in debug_printf message
In gallivm_compile_module, fix a typo in the
debug_printf("Invoke as \"llc ..." message.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-10-05 01:48:37 +02:00
Samuel Pitoiset
8196a3c63e radv: remove useless checks around radv_CmdBindPipeline()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-04 23:18:51 +02:00
Samuel Pitoiset
b53c207659 radv: check that pipeline is different before binding it
We only need to dirty the descriptors when the pipeline is
a new one, because user SGPRs can be potentially different.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-04 23:18:48 +02:00
Matt Turner
2572c2771d i965: Validate "Special Requirements for Handling Double Precision Data Types"
I did not implement:

   CNL's restriction on 64-bit int + align16, because I don't think
   we'll ever use this combination regardless of hardware generation.

   The restriction on immediate DF -> F conversions, because there's no
   reason to ever generate that, and I don't even know how DF -> F
   conversions are supposed to work in Align16 since (1) the dst stride
   must be 1, but (2) the dst stride would have to be 2 for src and dst
   strides to be aligned.
2017-10-04 14:08:54 -07:00
Matt Turner
98298c7e3d i965: Fix and enable forgotten validation test
I seem to have forgotten I still had work to do.
2017-10-04 14:08:54 -07:00
Matt Turner
122ef3799d i965: Only insert error message if not already present
Some restrictions require something like strides to match between src
and dest. For multi-source instructions, I'd rather encapsulate the
logic for not inserting already present errors in ERROR_IF than
open-coding it multiple places.
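
A rough sketch of the idea, assuming errors are accumulated in a single
string. The struct, the helper and the ERROR_IF_SKETCH macro are
hypothetical stand-ins, not the validator's actual ERROR_IF.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ERROR_BUF_SIZE 256

/* Hypothetical stand-in for the validator's accumulated error text. */
struct error_log {
   char text[ERROR_BUF_SIZE];
};

/* Append msg unless the log already contains it as a substring. */
static void
error_log_add_once(struct error_log *log, const char *msg)
{
   assert(log != NULL && msg != NULL);

   if (strstr(log->text, msg) != NULL)
      return;

   size_t used = strlen(log->text);
   snprintf(log->text + used, sizeof(log->text) - used, "%s\n", msg);
}

/* The duplicate check lives in one place instead of at every call site. */
#define ERROR_IF_SKETCH(log, cond, msg)      \
   do {                                      \
      if (cond)                              \
         error_log_add_once((log), (msg));   \
   } while (0)
```

Checking the same restriction once per source then records the message
only once.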
2017-10-04 14:08:54 -07:00
Matt Turner
5e76cf153c i965: Avoid validation error when src1 is not present
There can be no violation of the restriction that source offsets are
aligned if there is only one source offset.
2017-10-04 14:08:54 -07:00
Matt Turner
cacc229ba0 i965: Remove validate_reg()
Replaced by the assembly validator, and in fact gets in the way of
writing tests for the assembly validator.
2017-10-04 14:08:54 -07:00
Matt Turner
678d88bcee i965: Add and use STRIDE and WIDTH macros
You'll notice there were bugs in some of the code being replaced.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-10-04 14:08:54 -07:00
Matt Turner
4c961a5e79 i965: Add parentheses around usage of macro arguments
Otherwise I cannot use this macro in test_eu_validate.cpp
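
For illustration, a sketch of why bare macro arguments break once the
caller passes an expression. SCALE_UNSAFE and SCALE_SAFE are hypothetical
macros, not the one changed by this commit.

```c
#include <assert.h>

/* Hypothetical macros, not the ones touched by the commit. */
#define SCALE_UNSAFE(x) (x * 4)     /* argument used bare */
#define SCALE_SAFE(x)   ((x) * 4)   /* argument parenthesized */

static void
check_scale_macros(void)
{
   assert(SCALE_UNSAFE(1 + 2) == 9);   /* expands to (1 + 2 * 4) */
   assert(SCALE_SAFE(1 + 2) == 12);    /* expands to ((1 + 2) * 4) */
}
```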

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-10-04 14:08:54 -07:00
Matt Turner
1fcdb1cbea i965: Add GLK, CFL, CNL to test_eu_validate.c 2017-10-04 14:08:54 -07:00
Matt Turner
d4c39e9cff i965: Add Atom graphics names to parse_devid_override() 2017-10-04 14:08:54 -07:00
Matt Turner
6db5ec7deb i965: Fix support for disassembling 64-bit integer immediates
The type suffixes were wrong, and the 16 was missing the 0 prefix.

Fixes: 92f787ff86 ("i965: Add support for disassembling 64-bit integer immediates")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-10-04 14:08:54 -07:00
Matt Turner
7e88f93469 i965/fs: Rewrite fsign64 to skip the float -> double conversion
Rewrite fsign64 without the float -> double conversion. Low power parts
have additional restrictions when it comes to operating on 64-bit types, and
the instruction used to do the conversion violates one of them:
specifically, the restriction that "Source and Destination horizontal
stride must be aligned to the same qword".

Previously we generated a float and then converted, but we can avoid the
conversion by using the same extract-the-sign-bit + or-in-1.0 algorithm
by directly operating on the high four bytes of each double-precision
component in the result.
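
A scalar C sketch of the extract-the-sign-bit + or-in-1.0 idea applied to
the high four bytes of a double. The function name is hypothetical, and
this models the arithmetic only, not the instructions the backend emits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of sign() on a double. */
static double
fsign64_sketch(double x)
{
   assert(sizeof(double) == sizeof(uint64_t));

   /* Zero (of either sign) yields zero in this sketch. */
   if (x == 0.0)
      return 0.0;

   uint64_t bits;
   memcpy(&bits, &x, sizeof(bits));

   /* Work on the high 32 bits only: keep the sign bit and OR in the
    * high half of 1.0 (0x3ff00000). The low 32 bits of +/-1.0 are zero. */
   uint32_t hi = (uint32_t)(bits >> 32);
   hi = (hi & 0x80000000u) | 0x3ff00000u;
   bits = (uint64_t)hi << 32;

   double result;
   memcpy(&result, &bits, sizeof(result));
   return result;
}
```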

In SIMD8 and SIMD16 this cuts one instruction from the implementation,
and more importantly that instruction is the one which violated the
regioning restriction.

Along the way I removed some comments that I did not think helped, and
some code about double comparisons which does not seem to be necessary
today.

This prevents validation failures caught by the new EU validation code
added in later patches.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-10-04 14:08:54 -07:00
Matt Turner
b541945c20 i965/fs: Unpack count argument to 64-bit shift ops on Atom
64-bit operations on Atom parts have additional restrictions over their
big-core counterparts (validated by later patches).

Specifically, the restriction that "Source and Destination horizontal
stride must be aligned to the same qword" is violated by most shift
operations since NIR uses a 32-bit value as the shift count argument,
and this causes instructions like

   shl(8)          g19<1>Q         g5<4,4,1>Q      g23<4,4,1>UD

where src1 has a 32-bit stride, but the dest and src0 have a 64-bit
stride.

This caused ~4 pixels in the ARB_shader_ballot piglit test
fs-readInvocation-uint.shader_test to be incorrect. Unfortunately no
ARB_gpu_shader_int64 test hit this case because they operate on
uniforms, and their scalar regions are an exception to the restriction.

We work around this by effectively unpacking the shift count, so that we
can read it with a 64-bit stride in the shift instruction. Unfortunately
the unpack (a MOV with a dst stride of 2) is a partial write, and cannot
be copy-propagated or CSE'd.
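
A small C model of the workaround, assuming SIMD8. The function names are
hypothetical; arrays stand in for registers, and the count is masked only
to keep the C shift well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIMD_WIDTH 8

/* Models the MOV with a dst stride of 2: each 32-bit count lands in the
 * first half of a 64-bit slot and the other half is left untouched,
 * which is why it counts as a partial write. */
static void
unpack_shift_counts(uint32_t *unpacked, const uint32_t *counts)
{
   assert(unpacked != NULL && counts != NULL);

   for (int ch = 0; ch < SIMD_WIDTH; ch++)
      unpacked[2 * ch] = counts[ch];
}

/* Models the shift reading its count with the same 64-bit stride as the
 * value and the destination. */
static void
shl64_sketch(uint64_t *dst, const uint64_t *val, const uint32_t *unpacked)
{
   assert(dst != NULL && val != NULL && unpacked != NULL);

   for (int ch = 0; ch < SIMD_WIDTH; ch++)
      dst[ch] = val[ch] << (unpacked[2 * ch] & 63);
}
```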

Bugzilla: https://bugs.freedesktop.org/101984
2017-10-04 14:08:54 -07:00
Matt Turner
2082c32950 i965/fs: Don't apply POW/FDIV workaround on Gen10+
The documentation says it applies only to Gens 8 and 9.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-10-04 14:08:37 -07:00
Matt Turner
d407935327 i965: Fix src0 vs src1 typo
A typo caused us to copy src0's reg file to src1 rather than reading
src1's as intended. This caused us to fail to compact instructions like

   mov(8)   g4<1>D    0D              { align1 1Q };

because src1 was set to immediate rather than architecture file. Fixing
this reenables compaction (after the precompact() pass changes the data
types):

   mov(8)   g4<1>UD   0x00000000UD    { align1 1Q compacted };

Fixes: 1cb0a7941b ("i965: Switch to using the logical register types")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-04 14:08:24 -07:00
Dave Airlie
ad3d98da9f radv: enable tc compatible htile for d32s8 also.
This enables tc compatible htile for stencil surfaces as well.

This gives a 3-5fps boost on Mad Max on high@4k.

It also depends on Bas's tc-compat htile patch.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-04 21:02:23 +01:00
Samuel Pitoiset
844ae722c4 radv: dump SPIRV when a GPU hang is detected
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-04 19:37:08 +02:00
Samuel Pitoiset
a2a350a3be radv: dump NIR when a GPU hang is detected
This looks a bit ugly to me, but the existing codepath
is not terribly elegant either.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-04 19:37:08 +02:00
Marek Olšák
94d800bfa3 ac: silence a warning 2017-10-04 17:00:05 +02:00
Daniel Stone
b65d6dafd6 egl/wayland: Don't use dmabuf with no modifiers
The dmabuf interface requires a valid modifier to be sent. If we don't
explicitly get a modifier from the driver, we can't know what to send;
it must be inferred from legacy side-channels (or assumed to be linear, if
none exists).

If we have no modifier, then we can only have a single-plane format
anyway, so fall back to the old wl_drm buffer import path.

Fixes: a65db0ad1c ("st/dri: don't expose modifiers in EGL if the driver doesn't implement them")
Fixes: 02cc359372 ("egl/wayland: Use linux-dmabuf interface for buffers")
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reported-by: Andy Furniss <adf.lists@gmail.com>
Cc: Marek Olšák <marek.olsak@amd.com>
2017-10-04 15:17:46 +01:00
Daniel Stone
6273d2f269 egl/wayland: Check queryImage return for wl_buffer
When creating a wl_buffer from a DRIImage, we extract all the DRIImage
information via queryImage. Check whether or not it actually succeeds,
either bailing out if the query was critical, or providing sensible
fallbacks for information which was not available in older DRIImage
versions.

Fixes: a65db0ad1c ("st/dri: don't expose modifiers in EGL if the driver doesn't implement them")
Fixes: 02cc359372 ("egl/wayland: Use linux-dmabuf interface for buffers")
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reported-by: Andy Furniss <adf.lists@gmail.com>
Cc: Marek Olšák <marek.olsak@amd.com>
2017-10-04 15:17:46 +01:00
Eric Engestrom
d246aa3a0d travis: move include path from $CC to $CFLAGS
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-04 15:02:37 +01:00
Tobias Klausmann
80bfff5c4f wayland-egl: add CFLAGS for wayland-egl.h include
Starting with commit ab0589c6ed ("wayland-egl: remove no longer needed
wayland-client dependency") the wayland-egl.h include was missing leading to a
build failure:

  CC       wayland-egl.lo
wayland-egl.c:33:10: fatal error: wayland-egl.h: No such file or directory
 #include "wayland-egl.h"
          ^~~~~~~~~~~~~~~

Strictly speaking we should be checking for wayland-egl in configure and
propagating its CFLAGS here.

Yet again, the current wayland-egl split is bonkers as the Wayland repo
provides a single header, no pkg-config file or library.

That will be resolved at a later stage, but in the meanwhile fix the
build.

Fixes: ab0589c6ed ("wayland-egl: remove no longer needed wayland-client
dependency")
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
[Emil Velikov: add some text about CFLAGS and current wayland-egl situation]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-04 14:25:06 +01:00
Emil Velikov
fd404e3c17 automake: add texcompress_s3tc_tmp.h to the sources list
Otherwise it will be missing from the tarball.

Fixes: f7daa737d1 ("mesa: Combine libtxc_dxtn sources into
texcompress_s3tc_tmp.h")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-04 14:21:40 +01:00
Leo Liu
409491e778 st/va: add RGB support to vlVaPutSurface
Tested-by: Andy Furniss <adf.lists@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-10-04 09:22:33 -04:00
Leo Liu
0fa950ecd3 st/va: don't re-allocate interlaced buffer with packed format
It caused corruption when vlVaPutImage put raw data into the fields.

v2: add RGB formats since it got uploaded here as well

Cc: mesa-stable@lists.freedesktop.org
Cc: Andy Furniss <adf.lists@gmail.com>
Tested-by: Andy Furniss <adf.lists@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-10-04 09:22:33 -04:00
Leo Liu
327480d10f st/vdpau: don't re-allocate interlaced buffer with packed YUV format
It caused corruption when vlVdpVideoSurfacePutBitsYCbCr put YUV data into the fields.

Cc: mesa-stable@lists.freedesktop.org
Cc: Andy Furniss <adf.lists@gmail.com>
Tested-by: Andy Furniss <adf.lists@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-10-04 09:22:33 -04:00
Bas Nieuwenhuizen
ae61fe4982 radv: Implement TC compatible HTILE.
The situations where we enable it are quite limited, but it works,
even for madmax, so let's just enable it.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-04 09:19:26 +02:00
Dave Airlie
4e93d6baae radv: emit fmuladd instead of fma to llvm.
For Vulkan SPIR-V the spec states
fma() Inherited from OpFMul followed by OpFAdd.

Matt says the backend will do the right thing depending on the
hardware being compiled for, if you use the fmuladd intrinsic.

Using the Mad Max pts test, on high settings at 4K:
CHP: 55->60
HGDD: 46->50
LM: 55->60
No change on Stronghold.

Thanks to Feral for spending the time to track this down.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-04 06:22:44 +01:00
Tapani Pälli
b2dce27373 android: fix build issues with brw_nir_trig_workarounds.c
Fixes: 848da66222 ("intel: use a flag instead of setting PYTHONPATH")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-04 07:39:05 +03:00
Lionel Landwerlin
d3acc240d0 intel: compiler: vec4: add missing default 0 lod
We set a similar default value for LOD in the fs backend for TXS/TXL.
Without this we end up generating invalid MOV with a null src.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.2 17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-10-03 22:50:46 +01:00
Jason Ekstrand
8733567e05 anv: Remove base_vertex/instance from push_constants
This is just legacy cruft.  We don't push these values; we pass them in
as vertex attributes.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-03 13:41:00 -07:00
Brian Paul
e4c7a2ab68 util: include string.h in u_string.h
To fix MinGW compiler warning about missing strlen() prototype.
Not sure how I missed this when fixing the malloc() / stdlib.h issue.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-10-03 14:33:00 -06:00
Brian Paul
33122e8a3d llvmpipe: silence 'variable may be used uninitialized' warnings
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-10-03 14:33:00 -06:00
Brian Paul
42eb3052c3 mesa: silence 'variable may be used uninitialized' warning in teximage.c
Found with MinGW optimized build.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-10-03 14:32:59 -06:00
Brian Paul
fed856478c mesa: silence 'variable may be used uninitialized' warning in bufferobj.c
Found with MinGW optimized build.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-10-03 14:32:59 -06:00
Brian Paul
980fe46d72 svga: wrap long comments in svga_tgsi_vgpu10.c
Trivial.
2017-10-03 12:07:05 -06:00
Brian Paul
362fb05b65 svga: tweak pre-VGPU10 rasterization offsets
It seems there are no perfect x/y biases for line drawing to satisfy all
applications.  Depending on the biases, either real apps produce results
similar to VGPU10 while Piglit's gl-1.0-ortho-pos fails, or vice versa.

Let's lean toward real applications (Solidworks, SolidEdge, Google Earth)
over Piglit.

Using (-0.5, -0.5) for points, lines and triangles, seems to generally
work well.

We don't seem to have these issues with VGPU10.

Tested with Piglit and CAD-oriented apitraces.  See VMware bugs 1775498
and 1905053.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-10-03 12:07:05 -06:00
Brian Paul
3e39abf6a0 svga: if we get nr_samples==1, store nr_samples=0
We need to be more careful not to treat nr_samples=1 as an msaa surface.
This patch prevents us from errantly declaring an MSAA shader resource
with 1 sample.
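
A tiny sketch of the normalization, assuming later code treats any
non-zero stored count as multisampled. Both function names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: a request for exactly one sample is stored as zero. */
static unsigned
store_nr_samples(unsigned nr_samples)
{
   return nr_samples == 1 ? 0 : nr_samples;
}

/* Hypothetical: with the normalization above, a plain non-zero test is
 * enough to identify an MSAA surface. */
static bool
is_msaa(unsigned stored_nr_samples)
{
   assert(stored_nr_samples != 1);
   return stored_nr_samples != 0;
}
```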

No Piglit regressions, fixes the above-described errors.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-10-03 12:07:05 -06:00
Charmaine Lee
3c71c42827 svga: emit sampler constants only if sampler view exists
It is possible to have holes in the shader emitter's sampler_target array.
A sampler_target of 0 does not necessarily mean there is no sampler view
specified, since the texture buffer target has the value 0.
With this patch, a sampler_view array is added to the shader emitter structure
to record whether there is a sampler view for each texture unit. We emit the
constant for the texcoord scale factor or texture buffer size only for units
that have a sampler view.
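
A sketch of why a separate presence flag is needed when 0 is a valid
target value. The struct, enum and function here are hypothetical; only
the field names sampler_target and sampler_view come from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SAMPLERS 4

/* Hypothetical target values; the only property that matters is that
 * the buffer target is 0. */
enum sketch_target { SKETCH_TARGET_BUFFER = 0, SKETCH_TARGET_2D = 2 };

struct sketch_emitter {
   unsigned sampler_target[MAX_SAMPLERS];
   bool sampler_view[MAX_SAMPLERS];   /* true if the unit has a view */
};

/* Count the units that would get a constant: the decision uses the
 * presence flag, never "sampler_target != 0". */
static unsigned
count_sampler_constants(const struct sketch_emitter *emit)
{
   assert(emit != NULL);

   unsigned count = 0;
   for (unsigned unit = 0; unit < MAX_SAMPLERS; unit++) {
      if (emit->sampler_view[unit])
         count++;
   }
   return count;
}
```

Unit 0 below holds a texture buffer (target 0) and is still counted,
while the empty units with the same target value are skipped.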

Fixes a rendering issue with Turbine after commit 1020e960440.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-10-03 12:07:05 -06:00
Brian Paul
b7c08e5081 svga: fix incorrect case in svga_typeless_format()
For the case of SVGA3D_X32_G8X24_UINT we incorrectly returned
SVGA3D_R32_FLOAT_X8X24.  We should return SVGA3D_R32G8X24_TYPELESS.

Note that we never actually use SVGA3D_X32_G8X24_UINT so this has
no impact.

No Piglit regressions.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-10-03 12:07:05 -06:00
Brian Paul
cbe72ae598 svga: add typeless switch cases in svga_typeless_format()
We sometimes pass typeless formats to this function.  By adding switch
cases we avoid the "Unexpected format XXX in svga_typeless_format"
warning messages.  No functional change.

No Piglit regressions, no above-mentioned warning messages.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-10-03 12:07:05 -06:00
Neha Bhende
9a7d42b71c svga: Allow sRGB format with PIPE_BIND_DISPLAY_TARGET binding flag on vgpu10.
This patch allows the use of sRGB formats for DISPLAY_TARGET on vgpu10.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-10-03 12:07:05 -06:00
Wladimir J. van der Laan
df6b320a83 etnaviv: Set up unknown GC3000 states
Set up new states that the blob started setting for GC3000 consistently.

This makes sure that when another test or driver leaves the GPU in
unpredictable state, these states are set up correctly for our
rendering.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-10-03 19:52:07 +02:00
Wladimir J. van der Laan
a2132fbd79 etnaviv: Fix point sprite rendering on GC3000
Setting PA_VIEWPORT_UNK state correctly is necessary to make point sprite
rendering on GC3000 work.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-10-03 19:51:52 +02:00
Wladimir J. van der Laan
ec254f4bfa etnaviv: Add support for DP2 instruction
A two-component dot product instruction is supported with HALTI2, use it
on hardware that supports it.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-10-03 19:49:47 +02:00
Wladimir J. van der Laan
80f608b530 etnaviv: Support opcodes with bit 6 set in assembler
Support opcodes with bit 6 set in assembler, and assert that only ops
0x00..0x7f are used.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-10-03 19:49:38 +02:00
Dylan Baker
df82012b2c travis: add meson build for vulkan drivers.
v2: - use -isystem`pwd` instead of cp to include fake linux header
      (Eric E., Emil)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-03 10:02:08 -07:00
Dylan Baker
7a5a986ddd meson: convert gtest to an internal dependency
In truth gtest is an external dependency that upstream expects you to
"vendor" into your own tree. As such, it makes sense to treat it more
like a dependency than an internal library, and collect its
requirements together in a dependency object.

v2: - include with -isystem instead of setting compiler args (Eric)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-03 10:02:08 -07:00
Dylan Baker
052c0d5eda meson: set C++ standard to C++11
RadeonSI requires C++11, clover requires C++11, LLVM requires it, so
llvmpipe may require it, and that covers most of the C++ code in mesa.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-03 10:02:04 -07:00
Dylan Baker
af867d72c6 meson: add window system deps to intel vulkan common
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-03 10:01:45 -07:00
Dylan Baker
cc4f587307 meson: look for libelf as a library if there is no pkgconfig
Required for older versions of libelf that don't have a pkgconfig file.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-03 10:01:45 -07:00
Gurchetan Singh
9d9a46d4ef egl/surfaceless: Use KMS swrast fallback
The kms_swrast extension is an actively developed software fallback,
and platform_surfaceless can use it if there are no available
hardware drivers.

v2: Split into 2 patches, use booleans, check LIBGL_ALWAYS_SOFTWARE,
    and modify the eglLog level (Emil, Eric, Tomasz).

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-03 17:56:15 +01:00
Gurchetan Singh
540c804297 egl/surfaceless: add probe device helper function
This will help us initialize a software driver, if it's needed
or requested.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-03 17:56:15 +01:00
George Kyriazis
a8e4a0f609 gallium/u_tests: fix ifdef for sync_file fences
include libsync.h only when libdrm is compiled in

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-03 11:50:48 -05:00
Brian Paul
2d4b57fc3e util: include stdlib.h in u_string.h to silence MinGW warning
Otherwise we don't get a prototype for malloc().

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-03 10:15:03 +01:00
Kenneth Graunke
bf15dc7a1b intel: Always set Cube Face Enables for all surfaces.
These shouldn't matter for non-cubes, and we always enable them all
for cubes, so we may as well set them all the time.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-03 00:06:34 -07:00
Kenneth Graunke
45cf049ba6 intel: Make Cube Face Enable fields consistent across generations.
I decided to use the one-boolean-per-cube-face approach because it's
clearer which bits correspond to which cube face.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-03 00:06:34 -07:00
Matt Turner
d400348c73 docs: Document that libtxc_dxtn is now no longer needed 2017-10-02 22:32:59 -07:00
Matt Turner
e057cda2ef docs: GL_ARB_indirect_parameters is now supported on i965/gen7+ 2017-10-02 22:32:59 -07:00
Matt Turner
40ef8362e5 travis: Remove libtxc_dxtn from the build
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-02 19:41:22 -07:00
Matt Turner
74b5568978 build: Remove HAVE_DLOPEN
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-02 19:41:22 -07:00
Matt Turner
8d02abd0fe mesa: Delete now unused dlopen.h
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-02 19:41:22 -07:00
Matt Turner
c17c47207b mesa: Remove force_s3tc_enable driconf variable
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-02 19:41:22 -07:00
Matt Turner
dc546a7bb3 gallium: Remove util_format_s3tc_init()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-02 19:41:22 -07:00
Matt Turner
3a8a5e77e8 gallium: Remove util_format_s3tc_enabled
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-02 19:41:22 -07:00
Matt Turner
f6c56e07fc mesa/st: Drop has_lib_dxtc argument from st_init_extensions()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-02 19:41:22 -07:00
Matt Turner
c5d5080284 mesa: Drop Mesa_DXTn from gl_context
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-02 19:41:22 -07:00
Matt Turner
78c6221f18 mesa: Drop function pointer checks in s3tc code
Now never null!

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-02 19:41:22 -07:00
Matt Turner
34cf3c43be mesa: Call DXTn functions directly
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-02 19:41:22 -07:00
Matt Turner
fb5338c4b7 mesa: Remove fprintf referring to libdxtn
When this file is included by Gallium, the fprintf causes it to fail to
compile. This is an unreachable error case, and we shouldn't be calling
fprintf directly.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-02 19:41:22 -07:00
Matt Turner
82c54c4fdc mesa: Remove prototypes and mark S3TC functions static
This file will be #included, so the functions should be static.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-02 19:41:22 -07:00
Matt Turner
7ce9999166 mesa: Remove commented-out DXTn fetch code
Has been disabled for 12 years.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-02 19:41:22 -07:00
Matt Turner
f7daa737d1 mesa: Combine libtxc_dxtn sources into texcompress_s3tc_tmp.h
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-02 19:41:22 -07:00
Matt Turner
04396a134f mesa: Import libtxc_dxtn sources
Imported from master (commit ef07298391c6dcad843e0b13e985090c1dd76e76)
of https://cgit.freedesktop.org/~mareko/libtxc_dxtn/

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-02 19:41:22 -07:00
Józef Kucia
14555d0b7a anv: Remove unreachable cases from isl_format_for_size()
The dstOffset and fillSize parameters must be multiples of 4.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.1 17.2" <mesa-stable@lists.freedesktop.org>
2017-10-03 00:43:06 +01:00
Józef Kucia
15fdbf9c39 anv: Fix vkCmdFillBuffer()
The vkCmdFillBuffer() command fills a buffer with a uint32_t value.
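
A CPU-side sketch of those semantics, assuming the offset and size are
multiples of 4. fill_buffer_u32() is a hypothetical helper, not the anv
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: repeat one 32-bit value across a byte range. */
static void
fill_buffer_u32(void *buffer, size_t dst_offset, size_t fill_size,
                uint32_t data)
{
   assert(buffer != NULL);
   assert(dst_offset % 4 == 0 && fill_size % 4 == 0);

   char *dst = (char *)buffer + dst_offset;
   for (size_t i = 0; i < fill_size; i += sizeof(data))
      memcpy(dst + i, &data, sizeof(data));
}
```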

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.1 17.2" <mesa-stable@lists.freedesktop.org>
2017-10-03 00:42:50 +01:00
Marek Olšák
2d62817da9 st/mesa: don't use pipe_surface for passing information about EGLImage
Use st_egl_image instead. radeonsi doesn't like when we create
a pipe_surface with PIPE_FORMAT_NV12.

This fixes NV12 texturing on radeonsi using kmscube.

Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-03 01:27:18 +02:00
Marek Olšák
d50ead53b8 gallium/u_tests: test sync_file fences
This should be sufficient for testing all kernel/libdrm/radeonsi codepaths
that are used by radeonsi.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-03 01:27:18 +02:00
Plamena Manolova
598d613dc3 i965: Implement ARB_indirect_parameters.
We can implement ARB_indirect_parameters for i965 by
taking advantage of the conditional rendering mechanism.
This works by issuing maxdrawcount draw calls and using
conditional rendering to predicate each of them with
"drawcount > gl_DrawID"

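A CPU-side sketch of the predication scheme, assuming each draw is
skipped when its predicate is false. The function name is hypothetical
and nothing here models the actual hardware commands.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: issue max_draw_count draws, each predicated on
 * "draw_count > draw_id". Returns how many draws actually execute. */
static unsigned
draw_with_indirect_count(unsigned max_draw_count, unsigned draw_count)
{
   unsigned executed = 0;

   for (unsigned draw_id = 0; draw_id < max_draw_count; draw_id++) {
      bool predicate = draw_count > draw_id;

      if (predicate)
         executed++;   /* the draw runs; otherwise it is discarded */
   }

   assert(executed <= max_draw_count);
   return executed;
}
```

The number of executed draws is the smaller of drawcount and
maxdrawcount.
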
Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-02 16:25:36 -07:00
Plamena Manolova
3fb8483565 i965: Refactor brw_try_draw_prims.
In order to add our ARB_indirect_parameters implementation we
need to refactor brw_try_draw_prims so that it operates on a
per primitive basis and move the loop into brw_draw_prims.
This commit refactors the brw_try_draw_prims function and
renames it to brw_draw_single_prim.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-02 16:25:36 -07:00
Plamena Manolova
646e112385 i965: Introduce brw_finish_drawing.
In order to add our ARB_indirect_parameters implementation we
need to refactor brw_try_draw_prims so that it operates on a
per primitive basis and move the loop into brw_draw_prims.
This commit introduces the brw_finish_drawing function where
we move the code that executes once after the loop.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-02 16:25:36 -07:00
Plamena Manolova
c63c8f5135 i965: Introduce brw_prepare_drawing.
In order to add our ARB_indirect_parameters implementation we
need to refactor brw_try_draw_prims so that it operates on a
per primitive basis and move the loop into brw_draw_prims.
This commit introduces the brw_prepare_drawing function where
we move the code that executes once before the loop.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-02 16:25:36 -07:00
Ian Romanick
765e1fa372 glsl: Remove spurious assertions
They are inside an if-statement that already checks that the variables are
not NULL.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-02 14:46:11 -07:00
Ian Romanick
ff5254bf08 glsl: Move 'foo = foo;' optimization to opt_dead_code_local
The optimization as done in opt_copy_propagation would have to be
removed in the next patch.  If we just eliminate that optimization
altogether, shader-db results, even on platforms that use NIR, are hurt
quite substantially.  I have not investigated why NIR isn't picking up
the slack here.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
2017-10-02 14:46:11 -07:00
Ian Romanick
623002f0b2 glsl/ast: Use logical-or instead of conditional assignment to set fallthru_var
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-02 14:46:11 -07:00
Ian Romanick
d5361d9f01 glsl/ast: Generate a more compact expression to disable execution of default case
Instead of generating a sequence like:

    run_default = true;
    if (i == 3) // some label that appears after default
        run_default = false;
    if (i == 4) // some label that appears after default
        run_default = false;
    ...
    if (run_default) {
        ...
    }

generate something like:

    run_default = !((i == 3) || (i == 4) || ...);
    if (run_default) {
        ...
    }

This eliminates one use of conditional assignment, and it enables the
elimination of another.
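
As a rough illustration of why the two forms above are interchangeable, here is a minimal C++ sketch. It is not the IR that the GLSL compiler generates; the label values 3 and 4 come from the example in this message, and the function names are hypothetical.

```cpp
#include <cassert>

// Labels that appear after `default:` in the example above.
constexpr int kLabelsAfterDefault[] = {3, 4};

// Old shape: start with true and clear the flag with one conditional
// assignment per label.
bool run_default_sequential(int i)
{
    bool run_default = true;
    for (int label : kLabelsAfterDefault) {
        if (i == label)
            run_default = false;
    }
    return run_default;
}

// New shape: OR all comparisons together and negate once, so no
// conditional assignment is needed.
bool run_default_compact(int i)
{
    bool matches_later_label = false;
    for (int label : kLabelsAfterDefault)
        matches_later_label = matches_later_label || (i == label);
    return !matches_later_label;
}

// Small self-check: both shapes agree for values around the labels.
void demo_run_default()
{
    for (int i = 0; i < 8; ++i)
        assert(run_default_sequential(i) == run_default_compact(i));
}
```

The compact form computes the flag in a single expression, which is what lets the conditional assignments go away.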

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-02 14:46:10 -07:00
Ian Romanick
3e5cd2aba9 glsl/ast: Explicitly track the set of case labels that occur after default
Previously the instruction stream was walked looking for comparisons
with case-label values.  This should generate nearly identical code.
For at least fs-default-notlast-fallthrough.shader_test, the code is
identical.

This change will make later changes possible.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-02 14:46:10 -07:00
Ian Romanick
f307de2838 glsl/ast: Convert ast_case_label::hir to ir_builder
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-02 14:46:10 -07:00
Ian Romanick
4a8086c5a5 glsl/ast: Use ir_binop_equal instead of ir_binop_all_equal
The values being compared are scalars, so these are the same.  While
I'm here, simplify the run_default condition to just deref the flag
(instead of comparing a scalar bool with true).

There is a bit of extra change in this patch.  When constructing an
ir_binop_equal ir_expression, there is an assertion that the types are
the same.  There is no such assertion for ir_binop_all_equal, so
passing glsl_type::uint_type with glsl_type::int_type was previously
fine.  A bunch of the code motion is to deal with that.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-02 14:46:10 -07:00
Ian Romanick
ed80746c1c glsl/ast: Stop processing a switch-statement after an error in the init-expression
This happens to work now because ir_binop_all_equal is used.  This
causes vector typed init-expressions to produce scalar Boolean values
after comparison.

The next commit changes ir_binop_all_equal to ir_binop_equal.  Vector
typed init-expressions will then produce vector Boolean values, and, in
debug builds, the ir_assignment constructor will fail an assertion.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-02 14:46:02 -07:00
Ian Romanick
6d1765c63a glsl: Don't pass NULL to ir_assignment constructor when not necessary
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-02 14:46:02 -07:00
Ian Romanick
3cc997c7c8 glsl: Convert lower_variable_index_to_cond_assign to ir_builder
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-02 14:46:02 -07:00
Ian Romanick
eb58668525 glsl: Fix coding standards issues in lower_variable_index_to_cond_assign
Mostly tabs-before-spaces, but there was some other trivium too.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-10-02 14:46:02 -07:00
Ian Romanick
acd8b86a76 glsl: Convert lower_vec_index_to_cond_assign to using ir_builder
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-10-02 14:46:02 -07:00
Ian Romanick
1f4fcdb2ca glsl: Return ir_variable from compare_index_block
This is basically a wash now, but it simplifies later patches that
convert to using ir_builder.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-10-02 14:46:01 -07:00
Ian Romanick
4d009455f3 glsl: Fix coding standards issues in lower_vec_index_to_cond_assign
Mostly tabs-before-spaces, but there was some other trivium too.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-10-02 14:46:01 -07:00
Ian Romanick
425921afa3 glsl: Fix coding standards issues in lower_if_to_cond_assign
Mostly tabs-before-spaces issues.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-10-02 14:46:01 -07:00
Bas Nieuwenhuizen
ef61d09d5b nir/spirv: Allow loop breaks in a switch body.
Per the SPIR-V spec 2.11 Structured Control Flow:

"The only blocks in a construct that can branch outside the construct are

...
- a break block for the innermost loop it is inside of.
..."

With

"Break block: A block containing a branch to the Merge Block of a loop header's merge instruction."

Note that it puts no restriction on not being in an if or switch within the innermost loop.

This passes the loop_break block to the switch body so it can properly detect loop breaks.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-02 20:33:04 +02:00
Rob Clark
7f3eab03fe freedreno/a5xx: fix missing restore state
RB_CLEAR_CNTL seems to be in a funny state after boot (at least on
8x96/a530).

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-02 13:17:15 -04:00
Samuel Pitoiset
278679f09a radv: make radv_dynamic_state_copy() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-02 19:00:05 +02:00
Dylan Baker
e915b8d267 meson: change vulkan icd config to - instead of _
Just to be consistent.

v2: - update meson.build too
v3: - remove unrelated whitespace change

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-02 09:33:19 -07:00
Dylan Baker
9342a7d6d6 meson: check for python2 mako
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-10-02 09:33:19 -07:00
Juan A. Suarez Romero
86a82b6af9 docs: update calendar, add news item and link release notes for 17.2.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-10-02 18:12:25 +02:00
Juan A. Suarez Romero
47ef8c8503 docs: add sha256 checksums for 17.2.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 5a71ed6fa5)
2017-10-02 18:12:25 +02:00
Juan A. Suarez Romero
9e74ee2f3e docs: add release notes for 17.2.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit bc12538a8e)
2017-10-02 18:12:25 +02:00
Emil Velikov
677edff5cf wayland-egl: rework and simplify wl_egl_window initialization
Use calloc instead of malloc + explicitly zeroing the different fields.
We need special handling for the version field which is of type
const intptr_t.

While we're here, document why keeping the constness is a good idea.

The wl_egl_window_resize() call is replaced with an explicit set of the
width/height.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Miguel A. Vico <mvicomoya@nvidia.com>
2017-10-02 16:29:38 +01:00
Emil Velikov
ebc51ff932 wayland-egl: move WL_EGL_EXPORT declaration to where it's used
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Miguel A. Vico <mvicomoya@nvidia.com>
2017-10-02 16:29:38 +01:00
Emil Velikov
0f8b0c04eb wayland-egl: use C99 comments
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Miguel A. Vico <mvicomoya@nvidia.com>
2017-10-02 16:29:38 +01:00
Emil Velikov
ab0589c6ed wayland-egl: remove no longer needed wayland-client dependency
It was required for wl_surface, which is opaque and forward declared in
an earlier patch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Miguel A. Vico <mvicomoya@nvidia.com>
2017-10-02 16:29:38 +01:00
Emil Velikov
5bd13d80fa wayland-egl: add stdint.h include for intptr_t
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Miguel A. Vico <mvicomoya@nvidia.com>
2017-10-02 16:29:38 +01:00
Emil Velikov
860deb4191 wayland-egl: forward declare struct wl_surface
It makes the header self-contained, and with a later commit we'll remove
the unnecessary wayland-client.h include.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Miguel A. Vico <mvicomoya@nvidia.com>
2017-10-02 16:29:38 +01:00
Emil Velikov
198af27c67 wayland-egl: rename wayland-egl-{priv,backend}.h
In preparation to lifting the whole thing out as a separate library.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Miguel A. Vico <mvicomoya@nvidia.com>
2017-10-02 16:29:38 +01:00
Emil Velikov
d884d8d007 egl/dri: link directly to libglapi.so
Shared glapi (libglapi.so) has been a requirement for years, in order
to build EGL.

Remove the no longer necessary dlopen/dlsym dance and link to the
library directly.

This allows us to remove a handful of platform specific workarounds, due
to the different name of the library.

v2:
 - Android: export the include dir (RobH)
 - Drop unused local variable (Eric)

Cc: Jonathan Gray <jsg@jsg.id.au>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Cc: Julien Isorce <julien.isorce@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Tested-by: Tomasz Figa <tfiga@chromium.org> (v1)
Tested-by: Rob Herring <robh@kernel.org>
2017-10-02 16:26:46 +01:00
Emil Velikov
21e271024d swr/rast: do not crash on NULL strings returned by getenv
The current convenience function GetEnv feeds the results of getenv
directly into std::string(). That is a bad idea, since the variable
may be unset, thus we feed NULL into the C++ construct.

The latter of which is not allowed and leads to a crash.
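
A minimal sketch of the kind of guard this calls for, assuming a helper that returns an empty string for an unset variable. The name get_env_or_empty is hypothetical and this is not the actual swr GetEnv code.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: look up an environment variable and return an
// empty string when it is unset, instead of constructing std::string
// from a null pointer (which is undefined behaviour).
std::string get_env_or_empty(const char *name)
{
    assert(name != nullptr);              // getenv(NULL) is not valid either
    const char *value = std::getenv(name);
    return value ? std::string(value) : std::string();
}
```

Callers can then test the result with empty() instead of comparing pointers.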

v2: Better variable name, implicit char* -> std::string conversion (Eric)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101832
Fixes: a25093de71 ("swr/rast: Implement JIT shader caching to disk")
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Laurent Carlier <lordheavym@gmail.com>
Cc: Bernhard Rosenkraenzer <bero@lindev.ch>
[Emil Velikov: make an actual commit from the misc diff]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Laurent Carlier <lordheavym@gmail.com> (v1)
2017-10-02 16:19:13 +01:00
Rob Clark
16ac70bdcf freedreno/a5xx: align height to GMEM
Similar to the way width/pitch alignment works, it seems like we need to
do similar for height.  Otherwise the BLIT from system memory to GMEM
can over-fetch beyond the end of the buffer, triggering a fault.

I'm not sure if there is a better solution yet.  Possibly we could fall
back to pre-a5xx style DRAW packets for cases where BLIT might over-
fetch.  (We in theory have that problem already with rendering to higher
mipmap levels, although fortunately those tend to use GMEM bypass.)

This fixes issues reported with glamor.

Reported-by: don.harbin@linaro.org
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-02 09:25:57 -04:00
Nicolai Hähnle
146c2b7c28 radeonsi: adjust clip discard based on line width / point size
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:45 +02:00
Nicolai Hähnle
63680471f9 radeonsi: remove si_context::{scissor_enabled,clip_halfz}
They are just copies of the rasterizer state.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:45 +02:00
Nicolai Hähnle
12f3155e28 radeonsi: simplify the signature of si_update_vs_writes_viewport_index
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:45 +02:00
Nicolai Hähnle
7bbcb6ac6c radeonsi: move current_rast_prim into si_context
v2: rebase fixes

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:45 +02:00
Nicolai Hähnle
6b416ec3d6 radeonsi: move and rename scissor and viewport state and functions
v2: change GET_MAX_SCISSOR to SI_MAX_SCISSOR

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:45 +02:00
Nicolai Hähnle
449ac258d1 radeonsi: remove si_apply_scissor_bug_workaround
It only affects pre-SI chips.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:44 +02:00
Nicolai Hähnle
c955f45946 radeonsi: move r600_viewport.c to si_viewport.c
This is purely a file-move + #include fixup + build system changes.
Other cleanups will follow in subsequent commits.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:44 +02:00
Nicolai Hähnle
30e37289ea radeonsi: fix maximum advertised point size / line width
The hardware registers store the half-size/width in 12.4 fixed point
format, so 8192 is the maximum.
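
The arithmetic can be sketched as follows, assuming an unsigned 16-bit field with 12 integer and 4 fractional bits. The helper names are hypothetical and the truncating conversion is a simplification, not necessarily what the driver does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr unsigned kIntBits = 12;
constexpr unsigned kFracBits = 4;
constexpr uint32_t kMaxRaw = (1u << (kIntBits + kFracBits)) - 1; // 0xFFFF

// Convert a full point size / line width to the raw 12.4 half-size,
// clamping to the representable range and truncating toward zero.
uint32_t encode_half_size_12_4(double full_size)
{
    double scaled = (full_size / 2.0) * (1u << kFracBits);
    scaled = std::max(0.0, std::min(scaled, double(kMaxRaw)));
    return static_cast<uint32_t>(scaled);
}

// Convert a raw 12.4 value back to a half-size.
double decode_half_size_12_4(uint32_t raw)
{
    assert(raw <= kMaxRaw);
    return raw / double(1u << kFracBits);
}
```

The largest raw value decodes to a half-size of 4095.9375, so the full size tops out just below 8192. Any larger request saturates at the same raw value.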

Fixes dEQP-GLES3.functional.rasterization.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:44 +02:00
Nicolai Hähnle
a3fa3b2e02 radeonsi: deduce rast_prim correctly for tessellation point mode
Together with the previous patches, this fixes
dEQP-GLES31.functional.primitive_bounding_box.wide_points.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:44 +02:00
Nicolai Hähnle
4d74432dd3 radeonsi: don't discard points and lines
This is a bit conservative, but a more precise solution requires access
to the rasterizer state. This is something to tackle after the fork between
r600 and radeonsi.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:44 +02:00
Nicolai Hähnle
f86a112b07 radeonsi: move current_rast_prim to r600_common_context
We'll use it in the scissors / clip / guardband state.

v2: avoid a performance regression on r600 when applied to
    (pre-fork) stable branches

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:43 +02:00
Nicolai Hähnle
6f83085ec0 st/mesa: use R10G10B10X2 format where applicable
This is the last step of fixing
dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgb_unsigned_int_2_10_10_10_rev
for radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:43 +02:00
Nicolai Hähnle
85a3e1cae0 gallium: add PIPE_FORMAT_R10G10B10X2_UNORM
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:43 +02:00
Nicolai Hähnle
d2b60e433e mesa/main: R10G10B10_(A2) formats are not color renderable in ES
The EXT_texture_type_2_10_10_10_REV (ES only) states the following issue:

   "1. Should textures specified with this type be renderable?

    UNRESOLVED: No.  A separate extension could provide this functionality."

This partially fixes
dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.{rgb,rgba}_unsigned_int_2_10_10_10_rev

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:43 +02:00
Nicolai Hähnle
f38b94285d mesa/main: select the R10G10B10X2_UNORM internal format based on data type
ES requires it. This is a partial fix for
dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgb_unsigned_int_2_10_10_10_rev

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 15:07:43 +02:00
Nicolai Hähnle
fcae1a64ec glsl: do not set the 'smooth' qualifier by default on ES shaders
It leads to surprising states with integer inputs and outputs on
vertex processing stages (e.g. geometry stages). Instead, rely on the
driver to choose smooth interpolation by default.

We still allow varyings to match when one stage declares it as smooth
and the other declares it without interpolation qualifiers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-10-02 15:07:42 +02:00
Rob Clark
d304c467ba freedreno: fix PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE
Fixes an assert in fd_acc_query_register_provider() about query provider
not already registered.

Fixes: 3f6b3d9d ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE")
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-02 08:44:57 -04:00
Eric Engestrom
c3f51526ac egl/wayland: simplify LIBGL_ALWAYS_SOFTWARE logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-02 13:10:48 +01:00
Nicolai Hähnle
6d23f7c65d radeonsi: fix a regression in integer cube map handling
A recent commit fixed the case of 8888 integer cube maps, which need the
workaround of replacing the data format with USCALED/SSCALED. However,
this broke the case of non-8888 integer cube maps; those still need the
fix of shifting the texture coordinates.

Fixes KHR-GL45.texture_gather.plain-gather-int-cube-array and similar.

Fixes: 6fb0c1013b ("radeonsi: workaround for gather4 on integer cube maps")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 12:17:15 +02:00
Nicolai Hähnle
052b974fed amd/common: move ac_build_phi from radeonsi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-10-02 12:17:15 +02:00
Samuel Pitoiset
70f6b95862 radv: remove unused radv_meta_state::btoi::render_pass handle
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-02 11:56:20 +02:00
Samuel Pitoiset
6f1447c090 radv: do not check the number of levels when doing fast htile
We shouldn't reach this point because HTILE is only enabled
when the number of levels is 1.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-02 11:56:20 +02:00
Samuel Pitoiset
06dbe0722f radv: cleanup radv_device_finish_meta_XXX() helpers
Unnecessary to double check that handles are not NULL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-02 11:56:20 +02:00
Samuel Pitoiset
2084629b63 radv: select the pipeline outside of emit_fast_clear_flush()
It can't change during the decompression pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-02 11:56:20 +02:00
Samuel Pitoiset
331a4f885a radv: drop useless param in emit_depth_decomp()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-02 11:56:20 +02:00
Samuel Pitoiset
87f4e432e3 radv: drop useless check in depth_view_can_fast_clear()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-02 11:56:20 +02:00
Samuel Pitoiset
689930f670 radv: add radv_subpass_clear_attachment() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-02 11:56:20 +02:00
Samuel Pitoiset
a821771c56 radv: add radv_attachment_needs_clear() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-02 11:56:20 +02:00
Samuel Pitoiset
0a208122d7 radv: remove unused param in radv_handle_{cmask,dcc}_image_transition()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-02 11:56:20 +02:00
Samuel Pitoiset
db2e68b66b radv: add radv_vi_dcc_enabled() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-02 11:56:20 +02:00
Samuel Pitoiset
457306fa4c radv: do not need to double zero-init the meta state structures
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-02 11:56:20 +02:00
Samuel Pitoiset
af62984c8a radv: inline destroy_render_pass()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-02 11:56:20 +02:00
Samuel Pitoiset
84635ef3a3 radv: use pipeline handles instead of objects for meta clear operations
To be consistent with other meta operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-02 11:56:20 +02:00
Samuel Pitoiset
a5f76d259b radv: inline blit2d_unbind_dst()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-02 11:56:20 +02:00
Samuel Pitoiset
219be27a09 radv: rework DCC/CMASK/FMASK/HTILE allocations
Add helpers and some comments to make the thing more readable.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-02 11:56:20 +02:00
Eric Engestrom
1262e828e7 meson: fix version typo + grammar
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-10-02 09:37:54 +01:00
Iago Toral Quiroga
5e584a9db7 i965: skip reading unused slots at the beginning of the URB for the FS
We can start reading the URB at the first offset that contains varyings
that are actually read in the URB. We still need to make sure that we
read at least one varying to honor hardware requirements.

This helps alleviate a problem introduced with 99df02ca26 for
separate shader objects: without separate shader objects we assign
locations sequentially, however, since that commit we have changed the
method for SSO so that the VUE slot assigned depends on the number of
builtin slots plus the location assigned to the varying. This fixed
layout is intended to help SSO programs by avoiding on-the-fly recompiles
when swapping out shaders, however, it also means that if a varying uses
a large location number close to the maximum allowed by the SF/FS units
(31), then the offset introduced by the number of builtin slots can push
the location outside the range and trigger an assertion.

This problem is affecting at least the following CTS tests for
enhanced layouts:

KHR-GL45.enhanced_layouts.varying_array_components
KHR-GL45.enhanced_layouts.varying_array_locations
KHR-GL45.enhanced_layouts.varying_components
KHR-GL45.enhanced_layouts.varying_locations

which use SSO and the location layout qualifier to select such
location numbers explicitly.

This change helps these tests because for SSO we always have to include
things such as VARYING_SLOT_CLIP_DIST{0,1} even if the fragment shader is
very unlikely to read them, so by doing this we free builtin slots from
the fixed VUE layout and keep the tests from crashing in this scenario.
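
The effect can be sketched with a toy model. This assumes a flat array of per-slot "is read" flags and hypothetical helper names; it is not the compiler utility added by this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index of the first slot that the fragment shader actually reads,
// or 0 if it reads none (we still have to start somewhere).
std::size_t first_read_slot(const std::vector<bool> &slot_is_read)
{
    for (std::size_t i = 0; i < slot_is_read.size(); ++i) {
        if (slot_is_read[i])
            return i;
    }
    return 0;
}

// Slot index relative to the point where URB reading starts.
std::size_t relative_slot(std::size_t absolute_slot, std::size_t first)
{
    assert(absolute_slot >= first);
    return absolute_slot - first;
}
```

With, say, 4 unread builtin slots and varyings at locations 0 and 30, the absolute slots are 4 and 34. Starting the read at slot 4 brings them back to 0 and 30, inside the range the text describes.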

Of course, this is not a proper fix: we'd still run into problems if someone
tries to use an explicit max location and read gl_ViewportIndex, gl_LayerID or
gl_CullDistance in the FS, but that would be a much less common bug and we
can probably wait to see if anyone actually runs into that situation in a real
world scenario before deciding that more aggressive changes are
required to support this without reverting 99df02ca26.

v2:
- Add a debug message when we skip clip distances (Ilia)
- we also need to account for this when we compute the urb setup
  for the fragment shader stage, so add a compiler util to compute
  the first slot that we need to read from the URB instead of
  replicating the logic in both places.

v3:
- Make the util more generic so it can account for all unused slots
  at the beginning of the URB, that will make it more useful (Ken).
- Drop the debug message, it was not what Ilia was asking for.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-10-02 08:27:13 +02:00
Matt Turner
3cfd6ad01c i965: Normalize types for FBL, FBH, etc
Allows the instructions to be compacted. The documentation claims that
some of these only accept UD types, even though the type doesn't change
the operation performed. Just normalize the types to ensure we get
instruction compaction.

The only functional changes are for FBL and CBIT (always use UD types)
and FBH (always use the same types).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-09-30 20:18:09 -07:00
Marek Olšák
da3cf0e206 radeonsi: don't use the template keyword
for C++ editors

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-09-30 19:03:07 +02:00
Marek Olšák
e90a2ed88e glx: don't use the template keyword
for C++ editors

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-09-30 19:03:07 +02:00
Marek Olšák
9592c43a96 gallium/vl: don't use the template keyword
for C++ editors

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-09-30 19:03:07 +02:00
Marek Olšák
874db83e24 egl/dri2: don't use the template keyword
for C++ editors

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-09-30 19:03:07 +02:00
Benedikt Schemmer
3797a82e78 radeonsi/uvd: clean up si_video_buffer_create
V2: remove code duplication and one unnecessary variable, minor whitespace fix

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-09-30 19:03:07 +02:00
Marek Olšák
e9cf64a67c radeonsi/uvd: fix planar formats broken since f70f6baaa3
Tested-by: Benedikt Schemmer <ben@besd.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-09-30 19:03:07 +02:00
Roland Scheidegger
740a1618c3 gallium: add new LOD opcode
The operation performed is all the same as LODQ, but with the usual
differences between dx10 and GL texture opcodes, that is separate resource
and sampler indices (plus result swizzling, and setting z/w channels
to zero).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-30 02:58:09 +02:00
Kamil Páral
d5e7ce28b5 drirc: whitelist glthread for Outlast
FPS increase 10-20% in starting locations on Core i5-4570 +
Radeon R9 270.
2017-09-29 20:53:32 +02:00
Jan Vesely
7148795665 travis: Add clover build using llvm-5.0
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-29 12:14:34 -04:00
Jan Vesely
8af90b59f9 travis: Add clover build using llvm-4.0
llvm-4 needs gcc 4.8:
http://releases.llvm.org/4.0.1/docs/ReleaseNotes.html#non-comprehensive-list-of-changes-in-this-release

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-29 12:14:34 -04:00
Jan Vesely
b9a358a3e6 travis: Add clover build using llvm-3.9
Use r600,radeonsi instead of i915
Update binutils, new linker is required for llvm-3.9:
https://www.ubuntuupdates.org/package/core/trusty/universe/updates/binutils-2.26

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-29 12:14:34 -04:00
Leo Liu
361d8f82c0 st/va: add dst rect to avoid scale on deint
For 1080p video transcode, the height is scaled to 1088 when deinterlacing
to a progressive buffer. Set the dst rect to make sure no scaling happens.

Fixes: 3ad8687 "st/va: use new vl_compositor_yuv_deint_full() to deint"

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Andy Furniss <adf.lists@gmail.com>
2017-09-29 10:06:30 -04:00
Nicolai Hähnle
d190bfc1ad radeonsi: emit DLDEXP and DFRACEXP TGSI opcodes
Note: this causes spurious regressions in some current piglit tests,
because the tests incorrectly assume that there is no denorm support for
doubles. I'm going to send out a fix for those tests as well.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:08:07 +02:00
Nicolai Hähnle
061303e4fd radeonsi: emit LDEXP opcode
The LLVM intrinsic has existed for a long time. The current name was
established in LLVM 3.9.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:08:04 +02:00
Nicolai Hähnle
6de5147d20 st/glsl_to_tgsi: use LDEXP when available
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:08:03 +02:00
Nicolai Hähnle
cad959d901 gallium: add LDEXP TGSI instruction and corresponding cap
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:08:01 +02:00
Nicolai Hähnle
2b0bfc51de tgsi: infer that dst[1] of DFRACEXP is an integer
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:07:59 +02:00
Nicolai Hähnle
5cf279bf7e gallivm: add support for TGSI instructions with two outputs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:07:57 +02:00
Nicolai Hähnle
7af64b4d4a gallivm: add dst register index to lp_build_tgsi_context::emit_store
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:07:55 +02:00
Nicolai Hähnle
3c78215a1c tgsi: clarify the semantics of DFRACEXP
The status quo is quite the mess:

1. tgsi_exec will do a per-channel computation, and store the dst[0]
   result (significand) correctly for each channel. The dst[1] result
   (exponent) will be written to the first bit set in the writemask.
   So per-component calculation only works partially.

2. r600 will only do a single computation. It will replicate the
   exponent but not the significand.

3. The docs pretend that there's per-component calculation, but even
   get dst[0] and dst[1] confused.

4. Luckily, st_glsl_to_tgsi only ever emits single-component instructions,
   and kind-of assumes that everything is replicated, generating this for
   the dvec4 case:

     DFRACEXP TEMP[0].xy, TEMP[1].x, CONST[0][0].xyxy
     DFRACEXP TEMP[0].zw, TEMP[1].y, CONST[0][0].zwzw
     DFRACEXP TEMP[2].xy, TEMP[1].z, CONST[0][1].xyxy
     DFRACEXP TEMP[2].zw, TEMP[1].w, CONST[0][1].zwzw

Settle on the simplest behavior, which is single-component calculation
with replication, document it, and adjust tgsi_exec and r600.
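
As a loose model of "single-component calculation with replication", the following C++ sketch computes the split once with std::frexp and copies the result to every destination channel. The struct layout and names are assumptions for illustration, and write masks are ignored.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical model of the two destinations.
struct DfracexpResult {
    double significand[2]; // dst[0]: the xy and zw double pairs
    int exponent[4];       // dst[1]: the x, y, z, w integer channels
};

// One calculation on a single source double, replicated everywhere.
DfracexpResult dfracexp_replicated(double src)
{
    DfracexpResult r;
    int e = 0;
    const double s = std::frexp(src, &e);
    for (double &v : r.significand)
        v = s;
    for (int &v : r.exponent)
        v = e;
    return r;
}

// 8.0 == 0.5 * 2^4, so every channel holds 0.5 / 4.
void demo_dfracexp()
{
    const DfracexpResult r = dfracexp_replicated(8.0);
    assert(r.significand[0] == 0.5 && r.significand[1] == 0.5);
    assert(r.exponent[0] == 4 && r.exponent[3] == 4);
}
```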

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:07:50 +02:00
Nicolai Hähnle
dbe7fc00d5 tgsi: fix the documentation of DLDEXP
Sourcing the exponent for the zw destination pair from Z is consistent
with both tgsi_exec and gallivm. In practice, st_glsl_to_tgsi always
generates per-channel instructions anyway.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:07:46 +02:00
Nicolai Hähnle
d713af711d tgsi: infer that DLDEXP's second source has an integer type
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:07:33 +02:00
Nicolai Hähnle
93bf9c114b glsl/lower_instruction: handle denorms and overflow in ldexp correctly
GLSL ES requires both, and while GLSL explicitly doesn't require correct
overflow handling, it does appear to require handling input inf/denorms
correctly.
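
For reference, the edge cases in question look like this with the C library's ldexp on single-precision floats. This is only a point of comparison, assuming IEEE-754 floats without flush-to-zero; it says nothing about the instruction sequence the lowering pass emits.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Thin wrapper so the float overload is used explicitly.
float reference_ldexp(float x, int exp)
{
    return std::ldexp(x, exp);
}

void demo_ldexp_edge_cases()
{
    const float inf = std::numeric_limits<float>::infinity();
    const float denorm_min = std::numeric_limits<float>::denorm_min();

    // Overflow: 2^128 does not fit in a float.
    assert(std::isinf(reference_ldexp(1.0f, 128)));
    // Denormal result: 2^-140 is below the smallest normal float.
    assert(std::fpclassify(reference_ldexp(1.0f, -140)) == FP_SUBNORMAL);
    // Denormal input: 2^-149 scaled by 2^149 is exactly 1.
    assert(reference_ldexp(denorm_min, 149) == 1.0f);
    // Infinite input stays infinite.
    assert(reference_ldexp(inf, -10) == inf);
}
```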

Fixes dEQP-GLES31.functional.shaders.builtin_functions.precision.ldexp.*

Cc: mesa-stable@lists.freedesktop.org
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 12:07:08 +02:00
Nicolai Hähnle
a208cd7ae4 util/queue: fix a race condition in the fence code
A tempting alternative fix would be adding a lock/unlock pair in
util_queue_fence_is_signalled. However, that wouldn't actually
improve anything in the semantics of util_queue_fence_is_signalled,
while making that test much more heavy-weight. So this lock/unlock
pair in util_queue_fence_destroy for "flushing out" other threads
that may still be in util_queue_fence_signal looks like the better
fix.
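
The idea of "flushing out" other threads with an empty lock/unlock pair can be sketched as follows. This is a toy Python model, not the util_queue code: the class `Fence`, its attributes, and `run_demo` are hypothetical names chosen for the example.

```python
import threading


class Fence:
    """Toy stand-in for a queue fence; names are illustrative only."""

    def __init__(self):
        self._lock = threading.Lock()
        self._cond = threading.Condition(self._lock)
        self._signalled = False
        self.destroyed = False

    def signal(self):
        with self._cond:
            self._signalled = True
            self._cond.notify_all()
        # A lock-free reader may already see _signalled == True while
        # this thread is still leaving the critical section.

    def is_signalled(self):
        # Deliberately lock-free, so the check stays cheap.
        return self._signalled

    def wait(self):
        with self._cond:
            while not self._signalled:
                self._cond.wait()

    def destroy(self):
        # Empty lock/unlock pair: blocks until any thread still inside
        # signal() has released the lock, then tears the fence down.
        with self._lock:
            pass
        self.destroyed = True


def run_demo():
    fence = Fence()
    worker = threading.Thread(target=fence.signal)
    worker.start()
    fence.wait()
    fence.destroy()
    worker.join()
    return fence
```

The cheap `is_signalled` check is left untouched; only the rarely called `destroy` pays for the extra synchronization.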

v2: rephrase the comment

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
2017-09-29 11:52:41 +02:00
Nicolai Hähnle
c49400a03b r600: cleanup set_occlusion_query_state
This fixes a warning caused by the fork (note the change in the function
signature):

../../../../../mesa-src/src/gallium/drivers/r600/r600_state_common.c: In function ‘r600_init_common_state_functions’:
../../../../../mesa-src/src/gallium/drivers/r600/r600_state_common.c:2974:36: warning: assignment from incompatible pointer type [-Wincompatible-pointer-types]
  rctx->b.set_occlusion_query_state = r600_set_occlusion_query_state;

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-29 11:47:37 +02:00
Nicolai Hähnle
5184a1e8ee r300: add missing case PIPE_SHADER_CAP_INT64_ATOMICS
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-29 11:47:34 +02:00
Nicolai Hähnle
797dd12c7b radeonsi: fix border color translation for integer textures
This fixes the extremely unlikely case that an application uses
0x80000000 or 0x3f800000 as border color for an integer texture and
helps in the also, but perhaps slightly less, unlikely case that 1 is
used as a border color.
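
For context, the two hexadecimal values above are also the IEEE 754 single-precision bit patterns of -0.0 and 1.0. The snippet below only demonstrates that fact; whether this is the exact mechanism behind the bug is an assumption, and `bits_as_float` is a helper invented for the example.

```python
import math
import struct


def bits_as_float(u32):
    """Reinterpret a 32-bit unsigned integer as an IEEE 754 float."""
    return struct.unpack("<f", struct.pack("<I", u32))[0]


one = bits_as_float(0x3F800000)
neg_zero = bits_as_float(0x80000000)
# The sign of a zero is not visible through ==, so inspect it separately.
neg_zero_sign = math.copysign(1.0, neg_zero)
```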

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 11:45:08 +02:00
Nicolai Hähnle
6eb9483912 radeonsi: clamp border colors for upgraded depth textures
The hardware does this automatically for unorm formats, but we need to
do it manually for unorm depth formats that have been upgraded to
Z32_FLOAT.

Fixes dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_depth
and others.

Fixes: d4d9ec55c5 ("radeonsi: implement TC-compatible HTILE")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 11:45:05 +02:00
Nicolai Hähnle
4c56e07029 radeonsi: clamp depth comparison value only for fixed point formats
The hardware usually does this automatically. However, we upgrade
depth to Z32_FLOAT to enable TC-compatible HTILE, which means the
hardware no longer clamps the comparison value for us.

The only way to tell in the shader whether a clamp is required
seems to be to communicate an additional bit in the descriptor
table. While VI has some unused bits in the resource descriptor,
those bits have unfortunately all been used in gfx9. So we use
an unused bit in the sampler state instead.
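
A minimal sketch of the scheme, assuming a single flag bit in one sampler state word. The bit position `UPGRADED_DEPTH_BIT` and the function name are hypothetical; the real descriptor layout is not reproduced here.

```python
# Hypothetical flag: set when a fixed-point depth format was upgraded
# to Z32_FLOAT, so the shader has to clamp the reference value itself.
UPGRADED_DEPTH_BIT = 1 << 29


def shadow_compare_ref(ref, sampler_state_word):
    """Return the comparison value the shader should actually use."""
    if sampler_state_word & UPGRADED_DEPTH_BIT:
        return min(max(ref, 0.0), 1.0)
    # Genuine floating-point depth formats keep the unclamped value.
    return ref
```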

Fixes dEQP-GLES3.functional.texture.shadow.2d.linear.equal_depth_component32f
and many other tests in dEQP-GLES3.functional.texture.shadow.*

Fixes: d4d9ec55c5 ("radeonsi: implement TC-compatible HTILE")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 11:44:50 +02:00
Nicolai Hähnle
7dfa891f32 radeonsi/gfx9: fix geometry shaders without output vertices
Not that those are super common or useful, but hey! Fun corner cases
of the API...

Fixes dEQP-GLES31.functional.geometry_shading.emit.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 11:43:09 +02:00
Nicolai Hähnle
a6ea4c1b93 amd/common: save an instruction in the build_cube_select sequence
Avoid a v_cndmask: the absolute value is free due to input modifiers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 11:43:07 +02:00
Nicolai Hähnle
5be5c1e0fa amd/common: fix build_cube_select
Fix the custom cube coord selection sequence to be identical to
the hardware v_cubesc/tc and OpenGL spec. Affects texture sampling
with user-provided derivatives.

Fixes dEQP-GLES3.functional.shaders.texture_functions.texturegrad.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 11:43:04 +02:00
Nicolai Hähnle
8ea7d3a5c8 st/glsl_to_tgsi: fix conditional assignments to packed shader outputs
Overriding the default (no-op) swizzle is clearly counter-productive,
since the whole point is putting the destination register as one of
the source operands so that it remains unmodified when the assignment
condition is false.

Fragment depth and stencil outputs are a special case due to how their
source swizzles are manipulated in translate_src when compiling to
TGSI.

Fixes dEQP-GLES2.functional.shaders.conditionals.if.*_vertex
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 11:42:59 +02:00
Nicolai Hähnle
2703fa613b st/glsl_to_tgsi: fix a use-after-free in merge_two_dsts
Found by address sanitizer.

The loop here tries to be safe, but in doing so, it ends up doing
exactly the wrong thing: the safe foreach is for when the loop
variable (inst) could be deleted and nothing else. However, this
particular loop can delete inst's successor, but not inst itself.
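
The failure mode can be modelled with a toy singly linked list. Python has no use-after-free, so a `freed` flag stands in for released memory; `Node`, `remove_after`, `walk_safe` and `walk_plain` are hypothetical names for this sketch.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.next = None
        self.freed = False


def build(names):
    nodes = [Node(n) for n in names]
    for a, b in zip(nodes, nodes[1:]):
        a.next = b
    return nodes[0]


def remove_after(node):
    victim = node.next
    node.next = victim.next
    victim.freed = True  # stands in for freeing the instruction


def walk_safe(head, should_merge):
    """'Safe' foreach: caches the successor before running the body."""
    visited = []
    node = head
    while node is not None:
        nxt = node.next  # cached too early if the body removes it
        visited.append((node.name, node.freed))
        if should_merge(node) and node.next is not None:
            remove_after(node)
        node = nxt
    return visited


def walk_plain(head, should_merge):
    """Plain foreach: reads the successor after the body has run."""
    visited = []
    node = head
    while node is not None:
        visited.append((node.name, node.freed))
        if should_merge(node) and node.next is not None:
            remove_after(node)
        node = node.next
    return visited
```

With a body that removes the successor of `a`, the "safe" walk still steps onto the removed node, while the plain walk skips it.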

Fixes: 8c6a0ebaad ("st/mesa: add st fp64 support (v7.1)")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-29 11:42:38 +02:00
Nicolai Hähnle
4ed419328d radeonsi: move descriptor logs to after corresponding draw/compute packet
It has to happen after descriptor uploads since otherwise we'll print out
the wrong GPU list / incorrectly claim descriptor corruption.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-29 11:37:06 +02:00
Nicolai Hähnle
9ddc6e16a9 amd/common: remove ac_shader_abi::chip_class
Redundant with the recently added ac_llvm_context::chip_class.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-29 11:37:03 +02:00
Nicolai Hähnle
5b86c53b47 gallium/radeon: fix a comment
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-29 11:36:46 +02:00
Iago Toral Quiroga
47e527bd81 i965/fs: force pull model for 64-bit GS inputs
Triggering the push model when 64-bit inputs are involved is not easy due to
the constraints on the maximum number of registers that we allow for this mode,
however, for GS with 'points' primitive type and just a couple of double
varyings we can trigger this and it just doesn't work because the
implementation is not 64-bit aware at all. For now, let's make sure that we
don't attempt this model with 64-bit inputs and we always fall back to pull
model for them.

Also, don't enable the VUE handles in the thread payload on the fly when we
find an input for which we need the pull model, this is not safe: if we need
to resort to the pull model we need to account for that when we setup the
thread payload so we compute the first non-payload register properly. If we
didn't do that correctly and we enable it on-the-fly here then we will end up
with VUE handles on the first non-payload register which will probably lead to
GPU hangs. Instead, always enable the VUE handles for the pull model so we
can safely use them when needed. The GS is going to resort to pull model
almost in every situation anyway, so this shouldn't make a significant
difference and it makes things easier and safer.

v2: Always enable the VUE handles for pull model, this is easier and safer
    and the GS is going to fallback to pull model almost always anyway (Ken)

v3: Only clamp the URB read length if we are over the maximum reserved for
    push inputs as we were doing in the original code (Ken).

v4: No need to clamp the urb read length if invocations > 1

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-09-29 08:18:25 +02:00
Jason Ekstrand
2df897cf1f i965/link: Use prog->nir instead of creating a temporary
This way, when NIR_PASS_V makes a clone of the shader (for testing
nir_clone), the new and lowered version gets re-assigned to prog->nir.

[jordan.l.justen@intel.com: Tested NIR_TEST_CLONE=1 with valgrind]
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-09-28 16:20:41 -07:00
Jason Ekstrand
006533d5ef i965/link: Make more use of NIR_PASS
[jordan.l.justen@intel.com: Tested NIR_TEST_CLONE=1 with valgrind]
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-09-28 16:20:35 -07:00
Jason Ekstrand
69ed3244d4 i965/link: Make better use of temporary variables
The way NIR_PASS works (and, by extension, nir_optimize) is that they
may clone the shader and throw the old one away.  (We use this for
testing nir_clone.)  It's better if we just make a temporary variable,
use it for everything, and re-assign to the gl_program at the end.
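
The pattern can be sketched in a few lines of Python. Everything here is a toy model: `Shader`, `Program`, `run_pass` and `link` are hypothetical names, and the clone-after-every-pass behavior is a simplification of the clone testing mode described above.

```python
class Shader:
    def __init__(self, ops):
        self.ops = ops
        self.alive = True


class Program:
    def __init__(self, nir):
        self.nir = nir


def run_pass(shader, transform, test_clone):
    """Run a pass; in clone-testing mode hand back a fresh copy."""
    shader.ops = transform(shader.ops)
    if test_clone:
        clone = Shader(list(shader.ops))
        shader.alive = False  # the old shader is thrown away
        return clone
    return shader


def link(prog, test_clone):
    nir = prog.nir  # temporary used for every pass
    nir = run_pass(nir, lambda ops: [op for op in ops if op != "nop"], test_clone)
    nir = run_pass(nir, lambda ops: ops + ["lowered"], test_clone)
    prog.nir = nir  # re-assign once, at the end
```

Holding on to the original `prog.nir` across the passes would leave a reference to a discarded shader whenever cloning is enabled.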

[jordan.l.justen@intel.com: Tested NIR_TEST_CLONE=1 with valgrind]
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-09-28 16:19:54 -07:00
Thomas Helland
ce09364d4e util: fix in-class initialization of static member
Fix a compile error with G++ 4.4

string_buffer_test.cpp:43: error: ISO C++ forbids initialization of
member ‘str1’
string_buffer_test.cpp:43: error: making ‘str1’ static
string_buffer_test.cpp:43: error: invalid in-class initialization of
static data member of non-integral type ‘const char*’

Tested-by: Vinson Lee <vlee at freedesktop.org>

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103002
2017-09-28 23:22:07 +02:00
Eric Engestrom
a35f25068a REVIEWERS: add myself as a Meson reviewer
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-28 18:08:59 +01:00
Eric Engestrom
573a60f177 REVIEWERS: add Meson
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-09-28 18:08:01 +01:00
Dylan Baker
a118322b4e meson: remove duplicate libisl dependency in anv
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-09-28 10:06:00 -07:00
Brian Paul
4d5497d50d svga: add missing PIPE_SHADER_CAP_INT64_ATOMICS switch cases
Silences a compiler warning.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-09-28 10:41:33 -06:00
Brian Paul
e8d09f80ea svga: trivial whitespace clean-ups in svga_screen.c
2017-09-28 10:41:33 -06:00
Brian Paul
f33fbe2cf9 gallium/util: use new util_vasprintf() function
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-28 10:41:33 -06:00
Brian Paul
864148d69e util: add util_vasprintf() for Windows (v2)
We don't have vasprintf() on Windows so we need to implement it ourselves.

v2: compute actual length of output string, per Nicolai Hähnle.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-28 10:41:33 -06:00
Brian Paul
76a4209dc0 st/mesa: don't call close() on Windows
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-28 10:41:33 -06:00
Neha Bhende
652bc4b537 svga: start advertising PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION
Since our driver supports arb_provoking_vertex, we can start
advertising PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION
Fixes ./clipflat & ./arb-provoking-vertex-render piglit tests

Tested piglit, glretrace on Hw 11 and Hw 13

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-09-28 10:41:33 -06:00
Marek Olšák
9d54025cd1 mesa: fix texture updates for ATI_fragment_shader
Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-09-28 17:48:33 +02:00
Lucas Stach
15e3657e43 etnaviv: optimize RS transfers
Currently we are blitting the whole resource when the RS is used to
de-/tile a resource. This can be very inefficient for large resources
where the transfer is only changing a small part of the resource
(happens a lot with glTexSubImage2D).

Optimize this by only blitting the tile aligned subregion of the
resource, which the transfer is going to change.
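
Computing the tile-aligned subregion amounts to rounding the box outwards to tile boundaries. A minimal sketch, assuming a 4x4 tile size by default (the real tile dimensions depend on the layout and are not taken from the patch) and a hypothetical helper name:

```python
def tile_aligned_box(x, y, w, h, tile_w=4, tile_h=4):
    """Expand (x, y, w, h) to the smallest enclosing tile-aligned box."""
    x0 = (x // tile_w) * tile_w
    y0 = (y // tile_h) * tile_h
    # Round the far edges up to the next tile boundary.
    x1 = -(-(x + w) // tile_w) * tile_w
    y1 = -(-(y + h) // tile_h) * tile_h
    return x0, y0, x1 - x0, y1 - y0


# A 3x3 update at (5, 6) only needs a 4x8 blit instead of the whole resource.
box = tile_aligned_box(5, 6, 3, 3)
```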

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-09-28 17:41:07 +02:00
Lucas Stach
69eb93cbb9 etnaviv: add resource subregion copy
This is useful if we only need to copy part of a larger resource, mostly
when using the RS engine to de-/tile on pipe transfers.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-09-28 17:41:01 +02:00
Lucas Stach
9df635844c etnaviv: support tile aligned RS blits
The RS can blit arbitrary tile aligned subregions of a resource by
adjusting the buffer offset.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-09-28 17:40:49 +02:00
Leo Liu
6ed61b8d3f st/va: use pipe transfer_map to map upload buffer
The function pipe_buffer_map() is only for linear pipe buffer,
with height as 0, and it's not for any 2D textures.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-28 09:22:55 -04:00
Gwan-gyeong Mun
c951976b50 anv: add an assertion in genX(BeginCommandBuffer)
To check a valid usage requirement.

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-28 13:20:14 +01:00
Gwan-gyeong Mun
d0d6a611d9 radv: add an assertion in radv_BeginCommandBuffer()
To check a valid usage requirement.

CID: 1401616

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-09-28 13:20:14 +01:00
Gwan-gyeong Mun
5603670bc0 gallium/docs: add reference links for resource_create method
Add reference links for the usage and bind arguments of resource_create().

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-28 13:20:14 +01:00
Gwan-gyeong Mun
c6c23e95a7 gallium/docs: fix a reference link for get_paramf
Previously, get_paramf linked to the same target as get_param. Change the
reference link to PIPE_CAPF_*.

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-28 13:20:14 +01:00
Iago Toral Quiroga
8e627af59d i965: enable up to 32 inputs for geometry shaders in gen8+
We have been exposing only 16 since 1e3e72e305 with arguments
based on register pressure and the number of available GRFs, however,
our scalar backend will always limit the number of push registers
for GS threads to 24 and fallback to pull model for anything else,
so there is really no reason to lower the number under those arguments.

By bumping this up to 32 we make it the same as all the other stages,
which is a nice feature to have that can help applications in some
cases (I recently fixed a bug in CTS that assumed that the number
of input locations in a stage matches the number of output locations
in the previous stage for example).

Pre-gen8, we use the vector backend and push model, so in that case
the arguments in 1e3e72e305 are still valid.

v2: check if we have scalar GS instead of the hw gen to enable this (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-09-28 12:36:32 +02:00
Samuel Pitoiset
913bfd42a3 radv: set image view type when decompressing depth surfaces
This was missing.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-28 08:52:38 +02:00
Eric Anholt
3e3772c1b3 broadcom/vc4: Fix release build
I remember thinking "gosh, it would be nice if I could do a kernel-style
'if (!IS_ENABLED(DEBUG))' instead of using an #ifdef, so the code was
compiled on both builds", and then forgot to test a release build anyway.

Fixes: a8fd58eae5 ("vc4: Add labels to BOs for debug builds or with VC4_DEBUG=surf set.")
Reported-by: Derek Foreman <derekf@osg.samsung.com>
2017-09-27 13:03:14 -07:00
Eric Anholt
a8fd58eae5 vc4: Add labels to BOs for debug builds or with VC4_DEBUG=surf set.
This has proven to be incredibly useful for debugging CMA allocation
failures and driving memory management improvements.  However, we don't
want to burden entry and exit from the BO cache with the labeling ioctl's
overhead on release builds.
2017-09-27 10:21:49 -07:00
Dylan Baker
673dda8330 meson: build "radv" vulkan driver for radeon hardware
This builds, installs, and has been tested on a r290x (Hawaii) with the Vulkan
CTS. It dies horribly in a fire at the same point for the meson build as the
autotools build.

v2: - enable radv by default
    - add shader cache support and enforce that it's built for radv
v3: - Fix typo in meson_options (Nicholas)
    - strip trailing 'svn' from llvm version before setting the version
      preprocessor flag (Bas)
    - Check for LLVM module requirements

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-27 09:12:34 -07:00
Dylan Baker
d1992255bb meson: Add build Intel "anv" vulkan driver
This allows building and installing the Intel "anv" Vulkan driver using
meson and ninja. The driver has been tested against the CTS and
seems to pass the same series of tests (they both segfault when the CTS
tries to run wayland wsi tests).

There are still a mess of TODO, XXX, and FIXME comments in here. Those
are mostly for meson bugs I'm trying to fix, or for additional things to
implement for other drivers/features.

I have configured all intermediate libraries and optional tools to not
build by default (meaning they will only be built if they're pulled in
as a dependency of a target that will actually be installed). This allows
us to avoid massive if chains, while ensuring that only the bits that
need to be built are.

v2: - enable anv, x11, and wayland by default
    - add configure option to disable valgrind
v3: - fix typo in meson_options (Nicholas)
v4: - Remove dead code (Eric)
    - Remove change to generator that was from v0 (Eric)
    - replace if chain with loop (Eric)
    - Fix typos (Eric)
    - define HAVE_DLOPEN for both libdl and builtin dl cases (Eric)
v5: - rebase on util string buffer implementation

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v4)
2017-09-27 09:12:19 -07:00
Dylan Baker
c8b9cf429f util/ralloc: Don't define assert with magic member without DEBUG
It is possible to have DEBUG disabled but asserts on (NDEBUG), which
cannot build because these asserts work on members that are only present
when DEBUG is on.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-09-27 09:07:28 -07:00
Dylan Baker
848da66222 intel: use a flag instead of setting PYTHONPATH
Meson doesn't allow setting environment variables for custom targets, so
we either need to not pass this as an environment variable or use a
shell script to wrap the invocation. The chosen solution has the
advantage of working for both autotools and meson.

v2: - put rules back in top scope (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-09-27 09:07:28 -07:00
Marek Olšák
a65db0ad1c st/dri: don't expose modifiers in EGL if the driver doesn't implement them
This unbreaks waffle/gbm (piglit/gbm) which fails initialization.

v2: also don't set queryDmaBufFormats

Reviewed-by: Daniel Stone <daniel@fooishbar.org>
2017-09-27 17:59:50 +02:00
Jason Ekstrand
4fe3913b96 vulkan/wsi/wayland: Return better error messages
Reviewed-by: Daniel Stone <daniels@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
2017-09-27 08:32:36 -07:00
Jason Ekstrand
537b9bc3e4 vulkan/wsi/wayland: Copy wl_proxy objects from oldSwapchain if available
This should save us some round trips while resizing.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
2017-09-27 08:32:36 -07:00
Jason Ekstrand
4369102498 vulkan/wsi/wayland: Stop caching Wayland displays
We originally implemented caching to avoid unneeded round-trips to the
compositor when querying surface capabilities etc. to set up the
swapchain.  Unfortunately, this doesn't work if vkDestroyInstance is
called after the Wayland connection has been dropped.  In this case, we
end up trying to clean up already destroyed wl_proxy objects which leads
to crashes.  In particular most of dEQP-VK.wsi.wayland is crashing
thanks to this problem.

This commit gets rid of the cache and simply embeds the wsi_wl_display
struct in the swapchain.  While we're at it, we can get rid of the
wl_event_queue that we were storing in the swapchain because we can just
use the one in the embedded wsi_wl_display.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Bugzilla: https://bugs.freedesktop.org/102578
Cc: mesa-stable@lists.freedesktop.org
2017-09-27 08:32:36 -07:00
Jason Ekstrand
77181d9580 vulkan/wsi/wayland: Refactor wsi_wl_display code
We convert it over to an init/finish model and make create/destroy
wrappers for the former.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
2017-09-27 08:32:36 -07:00
Jan Vesely
f67ceeffd4 clover: Query and export int64 atomics
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-09-27 11:13:22 -04:00
Adam Jackson
0852162950 glx: Be more tolerant in glXImportContext (v2)
Ugh the GLX code. __GLX_MAX_CONTEXT_PROPS is 3 because glxproto.h is
just a pile of ancient runes, so when the server begins sending more
than 3 context properties this code refuses to work _at all_.  Which is
all just silly. If _XReply succeeds, it will have buffered the whole
reply, we can just walk through each property one at a time.
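
Walking the reply one property at a time could look like the sketch below. It models the reply as a flat list of (name, value) words; the numeric property names are placeholders rather than real GLX tokens, and `parse_context_props` is a hypothetical helper.

```python
# Placeholder property names; the real GLX attribute tokens are not
# reproduced here.
PROP_SHARE_CONTEXT, PROP_VISUAL_ID, PROP_SCREEN = 1, 2, 3

KNOWN = {
    PROP_SHARE_CONTEXT: "share_xid",
    PROP_VISUAL_ID: "visual_id",
    PROP_SCREEN: "screen",
}


def parse_context_props(words):
    """Consume name/value pairs in order, with no upper bound on count."""
    ctx = {}
    for i in range(0, len(words) - 1, 2):
        name, value = words[i], words[i + 1]
        if name in KNOWN:
            ctx[KNOWN[name]] = value
        # Unknown properties are skipped instead of failing the import.
    return ctx
```

A reply with five properties, two of them unknown, is handled the same way as one with three.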

v2: Now with no arbitrary limits. (Eric Anholt)

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-09-27 10:11:37 -04:00
Tomasz Figa
23a09b4f67 egl/dri2: Implement swapInterval fallback in a conformant way (v2)
dri2_fallback_swap_interval() currently used to stub out swap interval
support in Android backend does nothing besides returning EGL_FALSE.
This causes at least one known application (Android Snapchat) to fail
due to an unexpected error and my loose interpretation of the EGL 1.5
specification justifies it. Relevant quote below:

    The function

        EGLBoolean eglSwapInterval(EGLDisplay dpy, EGLint interval);

    specifies the minimum number of video frame periods per buffer swap
    for the draw surface of the current context, for the current rendering
    API. [...]

    The parameter interval specifies the minimum number of video frames
    that are displayed before a buffer swap will occur. The interval
    specified by the function applies to the draw surface bound to the
    context that is current on the calling thread. [...] interval is
    silently clamped to minimum and maximum implementation dependent
    values before being stored; these values are defined by EGLConfig
    attributes EGL_MIN_SWAP_INTERVAL and EGL_MAX_SWAP_INTERVAL
    respectively.

    The default swap interval is 1.

Even though it does not specify the exact behavior if the platform does
not support changing the swap interval, the default assumed state is the
swap interval of 1, which I interpret as a value that eglSwapInterval()
should succeed if called with, even if there is no ability to change the
interval (but there is no change requested). Moreover, since the
behavior is defined to clamp the requested value to minimum and maximum
and at least the default value of 1 must be present in the range, the
implementation might be expected to have a valid range, which in case of
the feature being unsupported, would correspond to {1} and any request
might be expected to be clamped to this value.

Fix this by defaulting dri2_dpy's min_swap_interval, max_swap_interval
and default_swap_interval to 1 in dri2_setup_screen() and let platforms,
which support this functionality set their own values after this
function returns. Thanks to patches merged earlier, we can also remove
the dri2_fallback_swap_interval() completely, as with a singular range
it would not be called anyway.
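
The resulting behavior can be modelled in a few lines. This is a sketch under stated assumptions: `Display`, `Surface`, `egl_swap_interval` and the `driver_set` hook are hypothetical names, and the clamping logic mirrors the description above rather than the actual EGL layer code.

```python
class Display:
    def __init__(self, min_interval=1, max_interval=1, default_interval=1):
        # Defaults model a platform that cannot change the swap interval.
        self.min_interval = min_interval
        self.max_interval = max_interval
        self.default_interval = default_interval


class Surface:
    def __init__(self, dpy):
        self.swap_interval = dpy.default_interval


def egl_swap_interval(dpy, surf, interval, driver_set=None):
    clamped = max(dpy.min_interval, min(interval, dpy.max_interval))
    if clamped == surf.swap_interval:
        return True  # nothing to change, the driver is never called
    if driver_set is not None and not driver_set(surf, clamped):
        return False
    surf.swap_interval = clamped
    return True
```

With the singular range {1}, every request clamps to the current value, so the call succeeds without reaching a driver hook.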

v2: Remove dri2_fallback_swap_interval() completely thanks to higher
    layer already clamping the requested interval and not calling the
    driver layer if the clamped value is the same as current.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-09-27 11:35:47 +02:00
Marek Olšák
f70f6baaa3 gallium/radeon: consolidate PIPE_BIND_SHARED/SCANOUT handling
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-27 10:38:46 +02:00
Samuel Pitoiset
3ab0cff32c radeonsi: remove useless check in si_blit_decompress_color()
It's unnecessary to double-check that dcc_offset is not 0
because all callers already check that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-27 09:31:24 +02:00
Samuel Pitoiset
eba2abf54b gallium/radeon: more use of vi_dcc_formats_are_incompatible()
Found by inspection.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-27 09:31:24 +02:00
Samuel Pitoiset
8860b39d94 radv: store the amount of saved constants in the compute state
It's safer and more elegant.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-27 09:26:44 +02:00
Samuel Pitoiset
bd7fd6a0e4 radv: remove useless radv_meta_{begin,end}_XXX() helpers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-27 09:26:42 +02:00
George Kyriazis
e927cb55a9 swr: Remove unneeded comparison
No need to check if screen->pipe != pipe, so we can just assign it.  Just do it.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-26 18:09:19 -05:00
George Kyriazis
b9aa0fa7d6 swr: Handle resource across context changes
Swr caches fb contents in tiles.  Those tiles are stored on a per-context
basis.

When switching contexts that share resources we need to make sure that
the tiles of the old context are being stored and the tiles of the new
context are being invalidated (marked as invalid, hence contents need
to be reloaded).

The context does not get any dirty bits to identify this case.  This has
to be, then, coordinated by the resources that are being shared between
the contexts.

Add a "curr_pipe" hook in swr_resource that will allow us to identify a
MakeCurrent of the above form during swr_update_derived().  At that time,
we invalidate the tiles of the new context.  The old context will need to
have already stored its tiles by that time, which happens during glFlush().
glFlush() is being called at the beginning of MakeCurrent.

So, the sequence of operations is:
- At the beginning of glXMakeCurrent(), glFlush() will store the tiles
  of all bound surfaces of the old context.
- After the store, a fence will guarantee that all tile stores make
  it to the surface.
- During swr_update_derived(), when we validate the new context, we check
  all resources to see what changed, and if so, we invalidate the
  current tiles.

Fixes rendering problems with CEI/Ensight.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-26 18:09:15 -05:00
Jason Ekstrand
016de7e155 vulkan/wsi/wayland: Stop printing out the DRM device
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-09-26 15:45:48 -07:00
Kenneth Graunke
a553eb0fdf i965: Support copy propagating of untyped atomic surface indexes.
In the vec4 backend, SHADER_OPCODE_UNTYPED_ATOMIC's src[1] is the
surface index.  We want to copy propagate so we can use an immediate
message descriptor, rather than an indirect send.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-09-26 15:35:14 -07:00
Kenneth Graunke
66342c997f i965/vec4: Fix swizzles on atomic sources.
Atomic operation sources are scalar values, but we were failing to
select the .x component of the second operand.  For example,

   atomicCounterCompSwapARB(counter, 5u, 10u)

would generate

   mov(8) vgrf4.x:D, 5D
   mov(8) vgrf5.x:D, 10D

   mov(8) vgrf9.x:UD, vgrf4.xyzw:D
   mov(8) vgrf9.y:UD, vgrf5.xyzw:D

which wrongly selects the .y component of vgrf5, so the actual 10u value
would get dead code eliminated.  The swizzle works for the other source,
but both of them ought to be .xxxx.
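
The effect of the swizzle can be reproduced with a toy model of a masked vec4 move. This is a sketch only: `mov` and the list-based registers are assumptions for illustration, and `None` marks a channel that was never written.

```python
CHANNELS = "xyzw"


def mov(dst, writemask, src, swizzle):
    """Write src[swizzle[i]] into each dst channel i enabled in writemask."""
    for i, ch in enumerate(CHANNELS):
        if ch in writemask:
            dst[i] = src[CHANNELS.index(swizzle[i])]
    return dst


vgrf5 = [10, None, None, None]  # only .x holds the 10u operand

# Identity swizzle: the .y destination channel reads the unwritten src .y.
bad = mov([None] * 4, "y", vgrf5, "xyzw")

# .xxxx swizzle: every destination channel reads src .x.
good = mov([None] * 4, "y", vgrf5, "xxxx")
```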

Fixes the compare and swap CTS tests in:
KHR-GL45.shader_atomic_counter_ops_tests.ShaderAtomicCounterOpsExchangeTestCase

Cc: "17.2 17.1 17.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-09-26 15:35:11 -07:00
Kenneth Graunke
a62fe34098 i965/vec4: Actually handle atomic op intrinsics.
Embarrassingly, someone enabled the ARB_shader_atomic_counter_ops
extension for Gen7+ but never added the intrinsics to the switch
statement in the vec4 backend, so they just hit an unreachable()
call and died.

Fixes: 40dd45d0c6 (i965: Enable ARB_shader_atomic_counter_ops)
Cc: "17.2 17.1 17.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-09-26 15:35:06 -07:00
Kenneth Graunke
17eb2afada i965: Convert brw->*_program into a brw->programs[i] array.
This makes it easier to loop over programs.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-09-26 15:29:16 -07:00
Eric Anholt
b99cf705c8 anv: Fix some comment typos.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-26 14:50:29 -07:00
Eric Anholt
6cc59de9cd gallium: Weaken assertion about u_mm's align2 field.
vc5 MMU mappings are access-controlled at a 128kb boundary, so the 4kb
here was too small for that purpose.  Allowing any valid align2 value that
u_mm's 32-bit addressing can represent will still catch most cases of
people passing in a byte alignment.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-26 14:50:29 -07:00
Eric Anholt
bb7c9789c2 intel/genxml: Convert a not-present-or-"1" dict to a set.
I was implementing the same enum support in broadcom's gen_pack_header.py,
and did this same simplification there.
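
The simplification is the usual Python idiom of replacing a dict whose values are always 1 with a set. The function names below are hypothetical and the snippet is not taken from gen_pack_header.py; it only shows that both forms track the same membership information.

```python
def enum_names_with_dict(names):
    seen = {}
    for name in names:
        seen[name] = 1  # the value carries no information
    return sorted(seen)


def enum_names_with_set(names):
    seen = set()
    for name in names:
        seen.add(name)
    return sorted(seen)
```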

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-09-26 14:50:29 -07:00
Boris Brezillon
ef578906d8 broadcom/vc4: Fix infinite retry in vc4_bo_alloc()
cleared_and_retried is always reset to false when jumping to the retry
label, thus leading to an infinite retry loop.

Fix that by moving the cleared_and_retried variable definition to the
beginning of the function.  While we're at it, move the create variable
with the other local variables and explicitly reset its content in the
retry path.
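
The corrected control flow can be sketched as a loop whose flag is initialised exactly once. `alloc_with_retry`, `try_alloc` and `clear_cache` are hypothetical names; the sketch models the retry logic only, not the actual allocation ioctl.

```python
def alloc_with_retry(try_alloc, clear_cache):
    # Initialised once, outside the retry path. Re-initialising it on
    # every retry is what made the original loop never terminate.
    cleared_and_retried = False
    while True:
        handle = try_alloc()
        if handle is not None:
            return handle
        if cleared_and_retried:
            return None  # second failure: give up
        clear_cache()
        cleared_and_retried = True
```

An allocator that always fails is therefore called exactly twice before the function reports failure.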

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Fixes: 78087676c9 "vc4: Restructure the simulator mode."
2017-09-26 14:49:48 -07:00
Eric Anholt
68c91a87d7 broadcom/vc4: Keep pipe_sampler_view->texture matching the original texture.
I was overwriting view->texture with the shadow resource when we need to
do shadow copies (retiling or baselevel rebase), but that tripped up some
critical new sanity checking in state_tracker (making sure that stObj->pt
hasn't changed from view->texture through TexImage-related paths).

To avoid that, move the shadow resource to the vc4_sampler_view struct.

Fixes: f0ecd36ef8 ("st/mesa: add an entirely separate codepath for setting up buffer views")
2017-09-26 14:49:43 -07:00
Samuel Pitoiset
4b407a62c7 radv: fix saved compute state when doing statistics/occlusion queries
We are pushing 16 bytes of constants, so we have to save/restore
the same amount of data to avoid data corruption.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-26 23:14:48 +02:00
Daniel Stone
5b7e651364 Revert "wayland-drm: constify the callbacks struct"
The wayland-drm callback struct is referenced, rather than duplicated,
inside wayland-drm. Constifying this struct involved moving it on to the
stack; as a result, starting any EGL client on Wayland called into
random stack memory, and killed the compositor.

This reverts commit 1d0be5b3fe and
39d539e321.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Krzysztof Sobiecki <sobkas@gmail.com>
Fixes: 1d0be5b3fe ("wayland-drm: constify the callbacks struct")
2017-09-26 21:48:17 +01:00
Brian Paul
8822ea100c svga: silence unused var warning in optimized build with MAYBE_UNUSED
Trivial
2017-09-26 09:51:43 -06:00
Thomas Helland
d86bc36446 glcpp: Avoid unnecessary call to strlen
Length of the token was already calculated by flex and stored in yyleng,
no need to implicitly call strlen() via linear_strdup().

Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle at amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick at intel.com>

V2: Also convert this pattern in glsl_lexer.ll

V3: Remove a misplaced comment

V4: Use a temporary char to avoid type change
    Remove bogus +1 on length check of identifier
2017-09-26 18:25:38 +02:00
Thomas Helland
e7220d2c22 glcpp: Use string_buffer for line continuation removal
Migrate removal of line continuations to string_buffer. Previously it
used ralloc_strncat() to append strings, which internally calculates
strlen() of its argument on every call. The argument is the entire
shader, so the whole shader text was scanned multiple times.

Signed-off-by: Vladislav Egorov <vegorov180@gmail.com>
Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle at amd.com>

V2: Adapt to different API of string buffer (Thomas Helland)
2017-09-26 18:25:20 +02:00
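Note on the line continuation commit above: a simplified Python model of why rescanning on every append is costly. It is not glcpp's code. Both function names are hypothetical, the "scanned" counter only simulates the implicit strlen() work, and real line continuation handling has more cases than the single backslash-newline form used here.

```python
def remove_continuations_rescanning(shader):
    """Model an append that re-measures the whole shader on every call."""
    out = ""
    scanned = 0
    for piece in shader.split("\\\n"):
        scanned += len(shader)  # simulated strlen() over the full text
        out += piece
    return out, scanned


def remove_continuations_buffered(shader):
    """Model a buffer that tracks its own length: one pass over the text."""
    pieces = shader.split("\\\n")
    return "".join(pieces), len(shader)
```

With k continuations in a shader of n characters, the first model does roughly k * n character visits while the second stays at n.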
Thomas Helland
cad323f898 glsl: Change the parser to use the string buffer
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle at amd.com>

V2: Pointed out by Timothy
   - Fix pp.c reralloc size issue and comment

V3 - Use vprintf instead of printf where we should
   - Fixes failing make-check tests

V4 - Use buffer_append_char in a couple places
   - Use append_char in even more places
2017-09-26 18:25:00 +02:00
Thomas Helland
584a2a22ea util: Add tests for the string buffer
More tests could probably be added, but this should cover
concatenation, resizing, clearing, formatted printing,
and checking the length, so it should be quite complete.

Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle at amd.com>
Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de>

V2: Address review feedback from Timothy, plus fixes
   - Use a large enough char array
   - Actually test the formatted appending
   - Test that clear function resets string length

V3: Port to gtest

V4: Fix test makefile
    Fix copyright header
    Fix missing extern C
    Use more appropriate name for C-file
    Add tests for append_char
2017-09-26 18:24:46 +02:00
Thomas Helland
7885bb684d util: Add a string buffer implementation
Based on Vladislav Egorov's work on the preprocessor, but split
out into a utility that should be universal. Setup, teardown,
memory handling and general layout are modeled on the hash_table
and the set, to make it familiar for everyone.

A notable change is that this implementation is always null terminated.
The rationale is that it will be less error-prone, as one might
access the buffer directly, thereby reading a non-terminated string.
Also, vsnprintf and friends print the null-terminator.

Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

V2: Address review feedback from Timothy and Grazvydas
   - Fix MINGW preprocessor check
   - Changed len from uint to int
   - Make string argument const in append function
   - Move to header and inline append function
   - Add crimp_to_fit function for resizing buffer

V3: Move include of ralloc to string_buffer.h

V4: Use u_string.h for a cross-platform working vsnprintf

V5: Remember to cast to char * in crimp function

V6: Address review feedback from Nicolai
   - Handle !str->buf in buffer_create
   - Ensure va_end is always called in buffer_append_all
   - Add overflow check in buffer_append_len
   - Do not expose buffer_space_left, just remove it
   - Clarify why a loop is used in vprintf, change to for-loop
   - Add a va_copy to buffer_vprintf to fix failure to append arguments
     when having to resize the buffer for vsnprintf.

V7: Address more review feedback from Nicolai
   - Add missing va_end corresponding to va_copy
   - Error check failure to allocate in crimp_to_fit
2017-09-26 18:24:33 +02:00
Timothy Arceri
379b24a40d i965: make use of nir linking
For now linking is just removing unused varyings between stages.

shader-db results BDW:

total instructions in shared programs: 13198288 -> 13191693 (-0.05%)
instructions in affected programs: 48325 -> 41730 (-13.65%)
helped: 473
HURT: 0

total cycles in shared programs: 541184926 -> 541159260 (-0.00%)
cycles in affected programs: 213238 -> 187572 (-12.04%)
helped: 435
HURT: 8

V2:
- lower indirects on demoted inputs as well as outputs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-09-26 22:37:02 +10:00
Timothy Arceri
49e4248a93 i965/nir: export nir_optimize
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-09-26 22:37:02 +10:00
Timothy Arceri
833e4dd41a i965: call brw_shader_gather_info() from the callers of brw_create_nir()
This will allow us to insert a nir linking step in brw_link_shader().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-09-26 22:37:02 +10:00
Timothy Arceri
348cf74792 i965: create a brw_shader_gather_info() helper
This will help us call gather info at a later point and allow us
to do some linking in nir.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-09-26 22:37:02 +10:00
Timothy Arceri
45ef10c06a nir: add some helpers for doing linking
The initial helpers add support for removing unused varyings between
stages.

V2:
- Moved the io mask helper function into this file rather than
  nir.h so it's not used elsewhere considering it doesn't handle
  all corner cases.
- Use bitmask rather than hash table to handle tcs outputs (Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-09-26 22:37:02 +10:00
Timothy Arceri
3529f8213f glsl: mark xfb varyings as always active
This will be used by the nir linking pass so that we don't remove
otherwise unused varyings.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-09-26 22:37:02 +10:00
Timothy Arceri
4244bea859 nir: add always_active_io to nir variable
Will be used in the nir link pass to decide if we can remove a varying
or not.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-09-26 22:37:02 +10:00
Marek Olšák
06bfb2d28f r600: fork and import gallium/radeon
This marks the end of code sharing between r600 and radeonsi.
It's getting difficult to work on radeonsi without breaking r600.

A lot of functions had to be renamed to prevent linker conflicts.
There are also minor cleanups.

Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-26 04:21:14 +02:00
Kenneth Graunke
e1623da818 i965: Rename do_flush_locked to submit_batch().
do_flush_locked isn't a great name - especially given that there's no
locking going on in our code relating to execbuf.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-25 15:15:13 -07:00
Kenneth Graunke
962cc1bd17 i965: Use atomic ops in get_new_program_id().
We have a nice utility function for this, which eliminates the need for
locking stuff.  This isn't really performance critical, but it's less
code to use the atomic.

p_atomic_inc_return does pre-increment rather than post-increment, so we
change screen->program_id to be initialized to 0 instead of 1.  At which
point, we can just delete the initialization because intel_screen is
rzalloc'd.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-25 15:15:09 -07:00
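Note on the pre-increment remark above: a small Python model of why the initial value moves from 1 to 0. The Counter class is hypothetical, and the lock only stands in for a hardware atomic. It is not how p_atomic_inc_return is implemented.

```python
import threading


class Counter:
    """Hypothetical thread-safe counter used to compare the two styles."""

    def __init__(self, start):
        self._value = start
        self._lock = threading.Lock()

    def post_increment(self):
        """Return the old value, then bump it (the locked 'id++' style)."""
        with self._lock:
            old = self._value
            self._value += 1
            return old

    def inc_return(self):
        """Bump the value, then return the new one (pre-increment style)."""
        with self._lock:
            self._value += 1
            return self._value
```

Starting post_increment() at 1 and inc_return() at 0 hands out the same sequence of ids, which is why the explicit initialisation can be dropped when the containing structure is zero-filled.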
Kenneth Graunke
2eb26a9986 i965: Convert brw_bufmgr to use C11 mutexes instead of pthreads.
There's no real advantage or disadvantage here, it's just for stylistic
consistency with the rest of the codebase.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-25 15:15:01 -07:00
Kenneth Graunke
93bb91452a i965: Delete dead meta stencil blit program fields from brw_context.
These have been unused for a while now.
2017-09-25 15:14:44 -07:00
Tim Rowley
5a2bca5db5 swr/rast: Handle instanceID offset / Instance Stride enable
Supported in JitGatherVertices(); FetchJit::JitLoadVertices() may require
similar changes, and we will need to address this if it is determined that this
path is still in use.

Handle Force Sequential Access in FetchJit::Create.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-25 13:38:57 -05:00
Tim Rowley
68d8dd1fb5 swr/rast: Remove code supporting legacy llvm (<3.9)
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-25 13:38:57 -05:00
Tim Rowley
9c468c775b swr/rast: Fix allocation of DS output data for USE_SIMD16_FRONTEND
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-25 13:38:57 -05:00
Tim Rowley
d18c2a1fa4 swr/rast: Slightly more efficient blend jit
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-25 13:38:57 -05:00
Tim Rowley
5033d49d5d swr/rast: Properly sized null GS buffer
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-25 13:38:57 -05:00
Tim Rowley
9c82cf0f1e swr/rast: Move SWR_GS_CONTEXT from thread local storage to stack
Move structure, as the size is significantly reduced due to dynamic
allocation of the GS buffers.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-25 13:38:57 -05:00
Tim Rowley
efe7fa4384 swr/rast: Fetch compile state changes
Add ForceSequentialAccessEnable and InstanceIDOffsetEnable bools to
FETCH_COMPILE_STATE.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-25 13:38:57 -05:00
Tim Rowley
cd6e91d3a2 swr/rast: New GS state/context API
One piglit regression, which was a false pass:
  spec@glsl-1.50@execution@geometry@dynamic_input_array_index

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-25 13:38:57 -05:00
Tim Rowley
41565ddf7a swr/rast: Add support for R10G10B10_FLOAT_A2_UNORM pixel format
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-25 13:38:57 -05:00
Samuel Pitoiset
1cf508b731 radv: save/restore all viewports/scissors for meta operations
This is needed since we don't update the number of viewports/scissors
when they are set dynamically (according to the spec). In the following
scenario:

* vkCmdSetViewport()
* vkCmdClearColorImage() (or any other meta operations)

The viewports/scissors weren't saved correctly because no pipeline
was bound before, and thus the number of viewports/scissors were 0.

This fixes a regression with:

dEQP-VK.draw.negative_viewport_height.front_ccw_cull_back

Fixes: 60878dd00c ("radv: do not update the number of viewports in vkCmdSetViewport()")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-25 20:31:55 +02:00
Juan A. Suarez Romero
0509b27b9d docs: update calendar, add news item and link release notes for 17.1.10
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-09-25 17:08:20 +00:00
Juan A. Suarez Romero
0ac0e32ce1 docs: add sha256 checksums for 17.1.10
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 60df95c6bd)
2017-09-25 17:01:22 +00:00
Juan A. Suarez Romero
3e9ba8d0f5 docs: add release notes for 17.1.10
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 834d6c60db)
2017-09-25 17:01:18 +00:00
Eric Engestrom
adea68a8a2 git_sha1_gen: fix output on python3
String handling has changed on python3.

Before this patch, on python3:
	#define MESA_GIT_SHA1 "git-b'b99dcbfeb3'"
After:
	#define MESA_GIT_SHA1 "git-b99dcbfeb3"

(No change on python2, it always looked ok)

Cc: Jose Fonseca <jfonseca@vmware.com>
Fixes: b99dcbfeb3 "build: Convert git_sha1_gen script to Python."
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-09-25 14:58:26 +01:00
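Note on the git_sha1_gen fix above: a minimal Python 3 sketch that reproduces the two outputs quoted in the commit message. The function names are hypothetical, and decoding the bytes is only one plausible way to get the fixed output; the actual patch may differ.

```python
def sha1_define_broken(raw):
    """On Python 3, formatting bytes with %s embeds their repr()."""
    return '#define MESA_GIT_SHA1 "git-%s"' % raw.rstrip()


def sha1_define_fixed(raw):
    """Decode the bytes to text before formatting."""
    return '#define MESA_GIT_SHA1 "git-%s"' % raw.decode("ascii").rstrip()


# Bytes such as a subprocess would return for the abbreviated hash.
raw_output = b"b99dcbfeb3\n"
```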
Leo Liu
f3ed1d2f6b st/va/postproc: implement the DRM prime grabber
Acked-by: Christian König <christian.koenig@amd.com>
2017-09-25 09:42:14 -04:00
Leo Liu
b47bdf55dc vl/compositor: convert RGB buffer to YUV with color conversion
Acked-by: Christian König <christian.koenig@amd.com>
2017-09-25 09:42:14 -04:00
Leo Liu
737d13637d vl/csc: add a RGB to YUV CSC matrix
Acked-by: Christian König <christian.koenig@amd.com>
2017-09-25 09:42:14 -04:00
Leo Liu
a2ebe57992 vl/compositor: create RGB to YUV fragment shader
Acked-by: Christian König <christian.koenig@amd.com>
2017-09-25 09:42:14 -04:00
Leo Liu
169c077d1d st/va/postproc: use progressive target buffer for scaling
Scaling between interlaced buffers is inaccurate, especially when
scaling up, because blit scales the top field and the bottom field
separately, so weaving the resulting buffers loses accuracy.
Use shader deint for this case.

Acked-by: Christian König <christian.koenig@amd.com>
2017-09-25 09:42:14 -04:00
Leo Liu
1d1299f8a4 st/va: make internal func vlVaHandleSurfaceAllocate() call simpler
Acked-by: Christian König <christian.koenig@amd.com>
2017-09-25 09:42:14 -04:00
Leo Liu
96f89f440b st/va/postproc: add a full NV12 deint support from buffer I to P
Previously it was impossible to transcode an interlaced video: for the
encoder to work we have to force the buffer to progressive, but deint
from an I buffer to a P buffer was missing. With the new full YUV
deint function, it now works with weave and bob deint.

Also this will benefit transcoding video with scaling parameters.

Acked-by: Christian König <christian.koenig@amd.com>
2017-09-25 09:42:14 -04:00
Leo Liu
4f9e7b1279 vl/compositor: add Bob top and bottom to YUV deint function
Acked-by: Christian König <christian.koenig@amd.com>
2017-09-25 09:42:14 -04:00
Leo Liu
9484852cdb vl/compositor: remove vl_compositor_yuv_deint() function
No longer used.

Acked-by: Christian König <christian.koenig@amd.com>
2017-09-25 09:42:13 -04:00
Leo Liu
3ad8687295 st/va: use new vl_compositor_yuv_deint_full() to deint
We also set src rectangle explicitly just in case of the mismatch
of size between interlaced buffer and progressive buffer

Acked-by: Christian König <christian.koenig@amd.com>
2017-09-25 09:42:13 -04:00
Leo Liu
db28fdc0ad st/omx: use new vl_compositor_yuv_deint_full() to deint
v2: add dst rect to make sure no scale

Acked-by: Christian König <christian.koenig@amd.com>
2017-09-25 09:42:13 -04:00
Leo Liu
001358a97c vl/compositor: add a new function for YUV deint
It will replace previous deint function with abilities of
scaling and field deinterlacing

Acked-by: Christian König <christian.koenig@amd.com>
2017-09-25 09:42:13 -04:00
Leo Liu
abd05a6cc4 vl/compositor: extend YUV deint function to do field deint
It will add Bob deint ability to interlaced video for HW encoder

Acked-by: Christian König <christian.koenig@amd.com>
2017-09-25 09:42:13 -04:00
Leo Liu
4ef0828946 vl/compositor: separate YUV part from shader video buffer function
So that it can be re-used

Acked-by: Christian König <christian.koenig@amd.com>
2017-09-25 09:42:13 -04:00
Leo Liu
eb51838771 st/va/postproc: use video original size for postprocessing
Otherwise the aligned size will make video scaled

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-09-25 09:42:13 -04:00
Bas Nieuwenhuizen
3e685ec983 radv: Fix VK_KHR_image_format_list.
Spec adding corner cases ...

Fixes: 969537d935 "radv: Add support for more DCC compression with VK_KHR_image_format_list."
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-25 15:36:19 +02:00
Bas Nieuwenhuizen
bf0397b6f5 Revert "Revert "radv: fallback to an in-memory cache when no pipline cache is provided""
I tested this 10 times with
./deqp-vk --deqp-case=dEQP-VK.texture.filtering.3d.formats.r4g4b4a4*

and one full run of CTS, seems the issue is gone.

Also reduces CTS runtime by 30% or so.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-09-25 15:36:19 +02:00
Eric Engestrom
bb66af95a7 scons: use python3-compatible exceptions
These changes were generated using python's `2to3` tool.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-25 12:05:47 +01:00
Eric Engestrom
eb2efbba78 scons: use python3-compatible generator
These changes were generated using python's `2to3` tool.

Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-25 12:05:47 +01:00
Eric Engestrom
e361047568 scons: use python3-compatible lists
These changes were generated using python's `2to3` tool.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-09-25 12:05:44 +01:00
Eric Engestrom
29c8d755ea scons: use python3-compatible list-key check
These changes were generated using python's `2to3` tool.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-25 11:58:53 +01:00
Eric Engestrom
7d48219b3a scons: use python3-compatible print()
These changes were generated using python's `2to3` tool.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102852
Reported-by: Alex Granni <liviuprodea@yahoo.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-09-25 11:57:12 +01:00
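Note on the five scons commits above: a short Python sketch of the kinds of rewrites `2to3` typically produces for print, exceptions, key checks, lists and generators. The functions and their names are hypothetical and are not taken from the scons files. The comments show the Python 2 only form next to the portable one.

```python
def describe(env, key):
    """Return a message about `key` using Python 3 compatible syntax."""
    # Python 2 only: env.has_key(key)
    if key in env:
        return "%s=%s" % (key, env[key])
    return "%s is unset" % key


def sorted_keys(env):
    """Return the keys of `env` as a sorted list."""
    # On Python 3, keys() returns a view, so wrap it in list()
    # before sorting in place.
    keys = list(env.keys())
    keys.sort()
    return keys


def first(iterable):
    """Return the first item of `iterable`."""
    # Python 2 only: iter(iterable).next()
    return next(iter(iterable))


def parse_int(text):
    """Return int(text), or None after reporting a parse error."""
    try:
        return int(text)
    # Python 2 only: "except ValueError, e:"
    except ValueError as e:
        # Python 2 only: print "not a number:", e
        print("not a number:", e)
        return None
```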
Samuel Pitoiset
3f6a17a8fc radv: init the trace BO before compiling meta shaders
Otherwise, the disasm string is NULL for meta shaders.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-25 10:47:01 +02:00
Samuel Pitoiset
6f8c40734b radv: make radv_pipeline_init() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-25 10:46:59 +02:00
Samuel Pitoiset
2aea632292 radv: remove unused variable in radv_dump_annotated_shader()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-25 10:46:57 +02:00
Samuel Pitoiset
45ea90ef1f radv: make use of ATI_VENDOR_ID everywhere
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-25 10:46:55 +02:00
Samuel Iglesias Gonsálvez
d2cd9deeb8 anv: fix viewport transformation for z component
In Vulkan, for 'z' (depth) component, the scale and translate values
for the viewport transformation are:

pz = maxDepth - minDepth
oz = minDepth

zf = pz × zd + oz

Being zd, the third component in vertex's normalized device coordinates.

Fixes: dEQP-VK.draw.inverted_depth_ranges.*

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
2017-09-25 06:39:40 +02:00
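Note on the viewport fix above: the formula from the commit message written out as a small Python function. The function name is hypothetical. It only evaluates zf = pz × zd + oz and is not the driver's state setup code.

```python
def viewport_depth(zd, min_depth, max_depth):
    """Map a normalized device z coordinate to framebuffer depth."""
    pz = max_depth - min_depth  # scale
    oz = min_depth              # translate
    return pz * zd + oz
```

With an inverted range (min_depth = 1.0, max_depth = 0.0) the scale becomes negative, so zd = 0.25 maps to 0.75, which is the case the inverted_depth_ranges tests exercise.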
David Airlie
3e54493265 radv: add gfx9 scissor workaround
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
2017-09-24 05:50:02 +02:00
Adam Jackson
52ed3bca91 glx: Sort the GLX extension bit enum and table
Not quite asciibetical: ARB, then EXT, then vendor, just like the GL
extension enum just below. No functional change, but it bothered me.

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-09-22 21:13:45 -04:00
Wladimir J. van der Laan
3f7093bed2 etnaviv: Add missing includes after 6ace0b8
Add missing includes after 6ace0b8 (etnaviv: don't enable RT
full-overwrite when logicop is enabled), otherwise the etnaviv driver
won't build because of missing macros.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-by: Andres Gomez <agomez@igalia.com>
2017-09-22 20:49:03 +02:00
Lucas Stach
e9d37d68cf etnaviv: fix 16bpp clears
util_pack_color may leave undefined values in the upper half of the packed
integer. As our hardware needs the upper 16 bits to mirror the lower 16 bits,
this breaks clears of those formats if the undefined values aren't masked off.

I've only observed the issue with R5G6B5_UNORM surfaces, other 16bpp
formats seem to work fine.

Fixes: d6aa2ba2b2 (etnaviv: replace translate_clear_color with util_pack_color)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-09-22 20:48:32 +02:00
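Note on the 16bpp clear fix above: a Python sketch of the masking idea described in the commit message, assuming the packed colour arrives as a 32-bit integer whose upper half may hold garbage. The function name is hypothetical and this is not the etnaviv code.

```python
def clear_value_16bpp(packed):
    """Mask off the undefined upper half and mirror the lower 16 bits."""
    low = packed & 0xFFFF
    return (low << 16) | low
```

For example, a packed value of 0xDEADF800 (pure red in a 5:6:5 layout with junk in the upper half) becomes 0xF800F800.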
Tim Rowley
066d1dc951 swr/rast: remove llvm fence/atomics from generated files
We currently don't use these instructions, and since their API
changed in llvm-5.0 having them in the autogen files broke the mesa
release tarballs which ship with generated autogen files.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102847
CC: mesa-stable@lists.freedesktop.org
Tested-by: Laurent Carlier <lordheavym@gmail.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-22 11:38:57 -05:00
Jason Ekstrand
d372683339 vulkan: enum generator: Generate entries for extended enums
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-09-22 07:47:34 -07:00
Jason Ekstrand
a2fa09efd3 vulkan: enum generator: Stop using iterparse
While using iterparse is potentially a little more efficient, the Vulkan
registry XML is not large and using regular element tree simplifies the
parsing logic substantially.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-09-22 07:47:34 -07:00
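Note on dropping iterparse above: a minimal sketch of reading enum names with a regular element tree from Python's xml.etree.ElementTree. The XML snippet is a made-up, heavily reduced stand-in for the registry, and enum_names() is a hypothetical helper, not the generator's code.

```python
import xml.etree.ElementTree as et

# Made-up, reduced registry fragment for illustration only.
REGISTRY = """<registry>
  <enums name="VkResult" type="enum">
    <enum value="0" name="VK_SUCCESS"/>
    <enum value="1" name="VK_NOT_READY"/>
  </enums>
  <enums name="API Constants">
    <enum value="256" name="VK_MAX_PHYSICAL_DEVICE_NAME_SIZE"/>
  </enums>
</registry>"""


def enum_names(xml_text):
    """Map each enum type to its value names by walking the parsed tree."""
    root = et.fromstring(xml_text)
    result = {}
    # Only <enums> blocks marked type="enum" describe real enum types.
    for enums in root.findall("./enums[@type='enum']"):
        result[enums.attrib["name"]] = [
            e.attrib["name"] for e in enums.findall("./enum")
        ]
    return result
```

Once the whole tree is in memory, selection becomes a couple of findall() calls instead of tracking start and end events by hand.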
Lionel Landwerlin
0ee868c1f6 vulkan: enum generator: generate extension number defines
New extensions can introduce additional enums. Most of the new enums
will have disjoint numbers from the initial enums. For example new
formats introduced by VK_IMG_format_pvrtc :

VK_FORMAT_ASTC_10x8_UNORM_BLOCK = 177,
VK_FORMAT_ASTC_10x8_SRGB_BLOCK = 178,
VK_FORMAT_ASTC_10x10_UNORM_BLOCK = 179,
VK_FORMAT_ASTC_10x10_SRGB_BLOCK = 180,
VK_FORMAT_ASTC_12x10_UNORM_BLOCK = 181,
VK_FORMAT_ASTC_12x10_SRGB_BLOCK = 182,
VK_FORMAT_ASTC_12x12_UNORM_BLOCK = 183,
VK_FORMAT_ASTC_12x12_SRGB_BLOCK = 184,
VK_FORMAT_PVRTC1_2BPP_UNORM_BLOCK_IMG = 1000054000,
VK_FORMAT_PVRTC1_4BPP_UNORM_BLOCK_IMG = 1000054001,
VK_FORMAT_PVRTC2_2BPP_UNORM_BLOCK_IMG = 1000054002,
VK_FORMAT_PVRTC2_4BPP_UNORM_BLOCK_IMG = 1000054003,
VK_FORMAT_PVRTC1_2BPP_SRGB_BLOCK_IMG = 1000054004,
VK_FORMAT_PVRTC1_4BPP_SRGB_BLOCK_IMG = 1000054005,
VK_FORMAT_PVRTC2_2BPP_SRGB_BLOCK_IMG = 1000054006,
VK_FORMAT_PVRTC2_4BPP_SRGB_BLOCK_IMG = 1000054007,

It's obvious we can't have a single table for handling those anymore.

Fortunately the enum values actually contain the number of the
extension that introduced the new enums. So we can build an
indirection table off the extension number and then index by
subtracting the first enum of the format enum value.

This change makes the extension number available in the generated enum
code.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-09-22 07:47:34 -07:00
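Note on the extension number commit above: a Python sketch of how the extension number can be recovered from an extended enum value, using the Vulkan numbering scheme (base 1000000000, blocks of 1000 values per extension, extension numbers starting at 1). The constant and function names are hypothetical and this is not the generated code.

```python
EXT_BASE = 1000000000   # first value reserved for extension enums
EXT_BLOCK = 1000        # values reserved per extension


def extension_number(value):
    """Return the extension number encoded in `value`, or None for core enums."""
    if value < EXT_BASE:
        return None
    return (value - EXT_BASE) // EXT_BLOCK + 1


def extension_offset(value):
    """Return the index of `value` within its extension's block."""
    return (value - EXT_BASE) % EXT_BLOCK
```

For VK_FORMAT_PVRTC2_4BPP_SRGB_BLOCK_IMG = 1000054007 this gives extension number 55 and offset 7, so a per-extension table can be indexed by the offset.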
Lionel Landwerlin
7e90fc54e5 vulkan: enum generator: make registry more flexible
It will be used to store extension numbers as well.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-09-22 07:47:34 -07:00
Lionel Landwerlin
935b42d9bc vulkan: enum generator: sort enums by names
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-09-22 07:47:34 -07:00
Lionel Landwerlin
0ac7b84672 vulkan: enum generator: align function declarations/prototypes
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-09-22 07:47:34 -07:00
Grazvydas Ignotas
16e884d9e3 util/u_atomic: remove unnecessary __atomic functions
They are now provided by -latomic, which should be linked as needed
since previous commit.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-22 17:12:24 +03:00
Grazvydas Ignotas
2ef7f23820 configure: check if -latomic is needed for __atomic_*
On some platforms, gcc generates library calls when __atomic_* functions
are used, but does not link the required library (libatomic) automatically
(supposedly to allow the app to use some other atomics implementation?).

Detect this at configure time and add the library when needed. Tested
on armel (library was added) and on x86_64 (was not, as expected).

Some documentation on this is provided in GCC wiki:
https://gcc.gnu.org/wiki/Atomic/GCCMM

Fixes: 8915f0c0 "util: use GCC atomic intrinsics with explicit memory model"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102573
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-22 17:01:21 +03:00
Lucas Stach
6ace0b8bc8 etnaviv: don't enable RT full-overwrite when logicop is enabled
Logicop is a form of blending with the framebuffer, so we must allow
framebuffer reads when logicop is enabled.

Fixes: piglit gl-1.0-logicop on GC3000, which has logicop support

Signed-off-by: Lucas Stach <dev@lynxeye.de>
2017-09-22 12:30:42 +02:00
Anuj Phogat
7567e3ece8 Revert "intel: Remove unused Kabylake pci ids"
drm-intel is in favor of keeping the unused pci-id's which
are still listed in the h/w specs. To keep it uniform
across multiple gfx stack components, I'm reverting below
Mesa patches:
b2dae9f8fd
ebc5ccf3cc.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2017-09-21 14:12:27 -07:00
Anuj Phogat
f2723980b9 Revert "intel: Remove unused device info for KBL GT1.5"
This reverts commit 4c4c28ca70.

GT1.5 device info is required for a few reserved pci-id's.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-09-21 14:12:19 -07:00
Thomas Helland
030f4ecf74 gallium/util: Remove unused keymap
This is not used anywhere in the codebase. It's a hashtable
implementation that is based around cso_hash, and is therefore
(and as mentioned in a comment in the source) quite similar to
u_hash_table.

CC: Brian Paul<brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-21 20:42:38 +02:00
Kenneth Graunke
ef719f4fd2 i965: Force outputs_written to contain varyings needed by stream-out.
If transform feedback is recording a varying, it needs a slot in the
VUE map, regardless of whether or not the shader writes it.

Together with the previous patch, this fixes:
- KHR-GL45.enhanced_layouts.xfb_capture_struct

The test captures a structure where the vertex shader writes the first
and third members - but the second still needs a slot.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-09-21 09:39:32 -07:00
Kenneth Graunke
6d6fae95a3 i965: Compute VS/GS output VUE map from the NIR info.
unify_interfaces() only updates the NIR program info, not the copy
in the gl_program itself.  So, by using the old copy, we were missing
out on these updates.

The TCS/TES ones already did this correctly.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-09-21 09:39:31 -07:00
Kenneth Graunke
c9fbe772ba i965: Handle unwritten PSIZ/VIEWPORT/LAYER outputs in vec4 shaders.
This can occur if the shader is capturing some of the values from the
VUE header for transform feedback, but the shader hasn't written all of
them.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-09-21 09:39:27 -07:00
Kenneth Graunke
3bec992e36 i965: Fix brw_finish_batch to grow the batchbuffer.
brw_finish_batch emits commands needed at the end of every batch buffer,
including any workarounds.  In the past, we freed up some "reserved"
batch space before calling it, so we would never have to flush during
it.  This was error prone and easy to screw up, so I deleted it a while
back in favor of growing the batch.

There were two problems:

1. We're in the middle of flushing, so brw->no_batch_wrap is guaranteed
   not to be set.  Using BEGIN_BATCH() to emit commands would cause a
   recursive flush rather than growing the buffer as intended.

2. We already recorded the throttling batch before growing, which
   replaces brw->batch.bo with a different (larger) buffer.  So growing
   would break throttling.

These are easily remedied by shuffling some code around and whacking
brw->no_batch_wrap in brw_finish_batch().  This also now includes the
final workarounds in the batch usage statistics.  Found by inspection.

Fixes: 2c46a67b41 (i965: Delete BATCH_RESERVED handling.)

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-21 09:33:50 -07:00
Kenneth Graunke
5a746021ce i965: Move MI_BATCHBUFFER_END handling into brw_finish_batch().
This is, by definition, finishing the batch.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-21 09:33:47 -07:00
Nicholas Miell
1f25436079 radv: Implement VK_AMD_rasterization_order
Tested with AMD's Anvil OutOfOrderRasterization demo on a RX 560.

Signed-off-by: Nicholas Miell <nmiell@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-21 18:10:00 +02:00
Brian Paul
5513f01f72 glsl: silence signed/unsigned comparison warning
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-09-21 10:02:17 -06:00
Ilia Mirkin
c5b4a5b967 nv20: Enable ARB_texture_border_clamp
Fixes quite a few 'texwrap [12]d border color only' tests on NV20
(10de:0201).  All told, 40 more tests pass.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2017-09-21 10:29:49 -05:00
Ian Romanick
953a3cf0fd nv20: Fix GL_CLAMP
v2: Force T and R wrap modes to GL_CLAMP_TO_EDGE for 1D textures.
This fixes a regression in tex1d-2dborder.  The test uses a 1D texture
but it provides S and T texture coordinates.  Since the T wrap mode
would (correctly) be set to GL_CLAMP, the texture would gradually
blend (incorrectly) with the border color.

I also tried setting NV20_3D_TEX_FORMAT_DIMS_1D instead of
NV20_3D_TEX_FORMAT_DIMS_2D for 1D textures, but that did not help.

It is possible that the same problem exists for 2D textures with the
R-wrap mode, but I don't think there are any piglit tests for that.

No test changes on NV20 (10de:0201).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-09-21 10:28:32 -05:00
Jan Vesely
9c87150618 gallium: Add PIPE_SHADER_CAP_INT64_ATOMICS
Denotes availability of 64bit int atomic instructions

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-21 11:18:17 -04:00
Nicolai Hähnle
df8767a14e glsl/linker: properly fix output variable overlap check
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102904
Fixes: 15cae12804 ("glsl/linker: fix output variable overlap check")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-09-21 11:04:21 +02:00
Nicolai Hähnle
eb71394ff3 ac/surface: handle error when choosing preferred swizzle mode
CID: 1418140
Fixes: c4ac522511 ("ac/surface: handle S8 on gfx9")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-21 11:00:00 +02:00
Nicolai Hähnle
34126ed248 amd/addrlib: fix missing va_end() after va_copy()
There's no reason to use va_copy here.

CID: 1418113
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Fixes: e7fc664b91 ("winsys/amdgpu: add addrlib - texture addressing and alignment calculator")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-21 10:59:36 +02:00
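
The commit above removes an unnecessary va_copy(). For cases where a copy is genuinely needed, the C rule is that every va_copy() must be paired with its own va_end(). The following is a minimal sketch of that pairing, not addrlib code; the function name format_checked is hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: formats into buf only if the result fits.
 * The argument list is walked twice, so a copy is required. */
int format_checked(char *buf, size_t size, const char *fmt, ...)
{
    va_list args, probe;
    int needed, written;

    assert(buf != NULL && fmt != NULL);

    va_start(args, fmt);

    /* First pass: measure the output using a copy of the list. */
    va_copy(probe, args);
    needed = vsnprintf(NULL, 0, fmt, probe);
    va_end(probe);              /* matches va_copy() */

    if (needed < 0 || (size_t)needed >= size) {
        va_end(args);           /* matches va_start() on the error path */
        return -1;
    }

    /* Second pass: the original list is still untouched. */
    written = vsnprintf(buf, size, fmt, args);
    va_end(args);               /* matches va_start() */
    return written;
}
```

If the list is only walked once, as the commit message suggests was the case in addrlib, the copy and its va_end() can simply be dropped.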
Samuel Pitoiset
8e9e339c53 radv: copy the number of viewports/scissors at pipeline bind time
The number of viewports/scissors can only be specified at pipeline
creation time, so make sure to copy them when binding a new one
because the dynamic state is cleared in BeginCommandBuffer().

Fixes: dcf46e995d ("radv: do not update the number of scissors in vkCmdSetScissor()")
Fixes: 60878dd00c ("radv: do not update the number of viewports in vkCmdSetViewport()")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-21 09:03:20 +02:00
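
The interaction described above can be sketched as follows. This is a simplified illustration with hypothetical struct and function names, not radv's actual data structures: the counts live in the pipeline, the command buffer's dynamic state is reset when recording begins, and binding a pipeline is the point where the counts are copied.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not radv's real structs. */
struct pipeline {
    unsigned viewport_count;   /* fixed at pipeline creation time */
    unsigned scissor_count;    /* fixed at pipeline creation time */
};

struct cmd_buffer {
    unsigned viewport_count;
    unsigned scissor_count;
    const struct pipeline *pipeline;
};

/* Beginning a command buffer clears the dynamic state. */
void begin_command_buffer(struct cmd_buffer *cmd)
{
    cmd->viewport_count = 0;
    cmd->scissor_count = 0;
    cmd->pipeline = NULL;
}

/* Binding copies the counts, since the set-viewport/scissor
 * commands no longer update them. */
void bind_pipeline(struct cmd_buffer *cmd, const struct pipeline *pipeline)
{
    assert(pipeline != NULL);
    cmd->pipeline = pipeline;
    cmd->viewport_count = pipeline->viewport_count;
    cmd->scissor_count = pipeline->scissor_count;
}

/* Setting viewports dynamically would store the rectangles for
 * [first, first + count) but deliberately leaves the count alone. */
void set_viewports(struct cmd_buffer *cmd, unsigned first, unsigned count)
{
    (void)cmd;
    (void)first;
    (void)count;
}
```

Without the copy in bind_pipeline(), the counts would stay at zero after begin_command_buffer(), which is the failure mode the commit message describes.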
Topi Pohjolainen
3a1b7efce8 intel/blorp/hiz: Always set sample number
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-09-21 08:44:25 +03:00
Topi Pohjolainen
a6ab632ef7 i965/gen8: Remove unused gen8_emit_3dstate_multisample()
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-09-21 08:43:20 +03:00
Tapani Pälli
589457d97f mesa: free current ComputeProgram state in _mesa_free_context_data
This is already done for other program stages, and fixes a leak when using
compute programs.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102844
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-09-21 08:10:51 +03:00
Timothy Arceri
db4222fc54 mesa/st: fix infinite loops
Fixes: 9ac8fece63 (glsl: Unify ir_constant::const_elements and ::components)
Reviewed-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-09-21 13:28:09 +10:00
Timothy Arceri
a40b3d5a3c glsl: merge loop_controls.cpp with loop_unroll.cpp
Having this separate just makes the code harder to follow, and
requires an extra walk of the IR.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-09-21 11:56:21 +10:00
Timothy Arceri
e7424b2d73 glsl: move loop analysis helpers to loop_analysis.cpp
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-09-21 11:56:13 +10:00
Jason Ekstrand
d8eede1697 anv: Advertise VK_KHR_maintenance2
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-09-20 17:21:06 -07:00
Jason Ekstrand
772b07d91e anv/image: Use RENDER_SURFACE_STATE::X/Y Offset on SKL+
The Broadwell method of handling uncompressed views of compressed
textures was to make the texture linear and have a tiled shadow copy.
This isn't needed on Sky Lake because the HALIGN and VALIGN parameters
are specified in surface elements and required to be a multiple of 4.
This means that we can just use the X/Y Offset fields and we can avoid
the shadow copy song and dance.  This also makes ASTC work because ASTC
can't be linear and so the shadow copy method doesn't work there.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-09-20 17:21:06 -07:00
Jason Ekstrand
64f2aabcec intel/blorp: Handle clearing compressed surfaces
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-09-20 17:21:06 -07:00
Jason Ekstrand
f395d0abc8 intel/blorp: Internally expose surf_convert_to_uncompressed
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-09-20 17:21:06 -07:00
Jason Ekstrand
1e5fd2f839 anv/image: Support creating uncompressed views of compressed images
In order to get support everywhere, this gets a bit complicated.  On Sky
Lake and later, everything is fine because HALIGN/VALIGN are specified
in surface elements and are required to be at least 4 so any offsetting
we may need to do falls neatly within the heavy restrictions placed on
the X/Y Offset parameter of RENDER_SURFACE_STATE.  On Broadwell and
earlier, HALIGN/VALIGN are specified in pixels and are hard-coded to
align to exactly the block size of the compressed texture.  This means
that, when reinterpreted as a non-compressed texture, the tile offsets
may be anything and we can't rely on X/Y Offset.

In order to work around this issue, we fall back to linear where we can
trivially offset to whatever element we so choose.  However, since
linear texturing performance is terrible, we create a tiled shadow copy
of the image to use for texturing.  Whenever the user does a layout
transition from anything to SHADER_READ_ONLY_OPTIMAL, we use blorp to
copy the contents of the texture from the linear copy to the tiled
shadow copy.  This assumes that the client will use the image far more
for texturing than as a storage image or render target.

Even though we don't need the shadow copy on Sky Lake, we implement it
this way first to make testing easier.  Due to the hardware restriction
that ASTC must not be linear, ASTC does not work yet.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-09-20 17:21:06 -07:00
Jason Ekstrand
2c8058fb68 anv: Add a new anv_surface_state struct
This struct represents a full surface state including the addresses of
the referenced main and auxiliary surfaces (if any).  This makes
relocation setup substantially simpler and allows us to move 100% of the
surface state setup logic into anv_image where it belongs.  Before, we
were manually fishing data out of surface states when emitting
relocations so we knew how to offset aux address.  It's best to keep all
of the surface state emit logic together.  This also gets us closer, at
least cosmetically, to a world of no relocations where addresses are
placed in surface states up-front.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-09-20 17:21:06 -07:00
Jason Ekstrand
22e6858b2b anv/image: Break surface state fill logic into a helper
This gives us a single centralized place where we take an image view and
use it to fill out a surface state.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-09-20 17:21:06 -07:00
Jason Ekstrand
c7716718ac anv/image: Add support for the VkImageViewUsageCreateInfoKHR struct
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-09-20 17:21:06 -07:00
Samuel Iglesias Gonsálvez
c71e5c30a5 anv: Advertise point clipping properties
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-20 17:21:06 -07:00
Jason Ekstrand
29680ff9a8 anv: Add support for tessellation domain origin control
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-09-20 17:21:06 -07:00
Jason Ekstrand
fc91cbe20b spirv: Flip the tessellation winding order
It's not SPIR-V that's backwards from GLSL, it's Vulkan that's backwards
from GL.  Let's make NIR consistent with the source language and do the
flipping inside the Vulkan driver instead.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-09-20 17:21:06 -07:00
Jason Ekstrand
2891115671 anv/image: Add support for the new depth/stencil layouts
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-09-20 17:21:06 -07:00
Jan Vesely
3a5b69c09b clover: Wait for requested operation if blocking flag is set
v2: wait in map_buffer and map_image as well
v3: use event::wait instead of wait (skips fence wait for hard_event)
v4: use wait_signalled()

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Aaron Watry <awatry@gmail.com>
2017-09-20 18:48:46 -04:00
Francisco Jerez
bc4000ee40 clover: Run the associated action before an event is signalled.
And define a method for other threads to wait until the action
function associated with an event has been executed to completion.

For hard events, this will mean waiting until the corresponding
command has been submitted to the pipe driver, without necessarily
flushing the pipe_context and waiting for the actual command to be
processed by the GPU (which is what hard_event::wait() already does).

This weaker kind of event wait will allow implementing blocking memory
transfers efficiently.

Acked-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
2017-09-20 18:48:41 -04:00
Francisco Jerez
02f8ac6b70 clover: Wrap event::wait_count in a method taking care of the required locking.
Acked-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
2017-09-20 18:48:28 -04:00
Jason Ekstrand
ae8c7c703b anv/entrypoints_gen: Dedent the C code
This makes the C code be justified over to the left.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-09-20 14:56:45 -07:00
Bas Nieuwenhuizen
d235ff6e8f radv: Don't use a virtual function for getting the buffer virtual address.
We are really not going to use a winsys which does not need to store
the va, so might as well store it in a standard field.

Not sure this helps perf much though, as most of the cost is in the
cache miss accessing the bo anyway, which we still need to do.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-09-20 22:04:25 +02:00
Bas Nieuwenhuizen
ef721c77f1 radv: Only enter the immutable samplers init loop when we have some.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-09-20 22:04:25 +02:00
Bas Nieuwenhuizen
68dc19d400 radv: Use for_each_bit in the descriptor set flush.
Since most games use only a few, iterating through all of them is
a waste. Simplifies the code too.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-09-20 22:04:25 +02:00
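
The idea behind iterating only over set bits can be sketched like this. It is an illustration of the technique, not the macro radv uses; collect_set_bits is a hypothetical name, and __builtin_ctz is a GCC/Clang builtin, so other compilers would need an equivalent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes the index of every set bit in 'mask' to 'out' (lowest first)
 * and returns how many were written. The loop runs once per set bit,
 * not once per possible slot. */
unsigned collect_set_bits(uint32_t mask, unsigned out[32])
{
    unsigned n = 0;

    assert(out != NULL);

    while (mask) {
        /* Index of the lowest set bit (GCC/Clang builtin). */
        out[n++] = (unsigned)__builtin_ctz(mask);
        /* Clear the lowest set bit. */
        mask &= mask - 1;
    }
    return n;
}
```

With a mask such as 0x5 the loop body runs twice (for slots 0 and 2) rather than once for each of the 32 possible slots.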
Bas Nieuwenhuizen
25ea385279 radv: Add VK_KHR_bind_memory2 support.
Nothing too exciting, just adding the possibility for a pNext pointer,
and batch binding. Our binding is pretty much trivial.

It also adds VK_IMAGE_CREATE_ALIAS_BIT_KHR, but since we store no
state in radv_image, I don't think we have to do anything there.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-20 21:48:35 +02:00
Roland Scheidegger
886626960b llvmpipe, gallivm: implement lod queries (LODQ opcode)
This uses all the existing code to calculate lod values for mip linear
filtering. Though we'll have to disable the simplifications (if we know some
parts of the lod calculation won't actually matter for filtering purposes due
to mip clamps etc.). For better or worse, we'll also disable lod calculation
hacks (mostly should make a difference for cube maps) always - the issue with
per-pixel lod being difficult is mostly because we then have different mipmaps
needed for the actual texel fetch, which isn't a problem with lodq.
We still use approximation for the log2 - for that reason I believe the float
part of the lod is only accurate to about 4-5 bits (and one bit less with 1d
textures actually) which is hopefully good enough (though d3d10 technically
requires 6 bits - could use quadratic interpolation instead of linear to get
8 bits or so).
Since lodq requires unclamped lod, we also have to move some sampler key
calculations to texture sampling code - even if we know we're going to access
mipmap 0 we still have to calculate lod and apply lod_bias for lodq.

Passes piglit ARB_texture_query_lod tests (after having fixed the test).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-09-20 21:18:54 +02:00
Louis-Francis Ratté-Boulianne
4b41361894 i965: Fix duplication of DRI images
Some DRI image properties weren't properly duplicated in the
new image. Some properties are still missing, but I'm not
certain if there was a good reason to leave them out in the first
place.

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-20 07:07:05 -07:00
Nicolai Hähnle
704ddbcdf6 radeonsi: set MIP_POINT_PRECLAMP to 0
This fixes a bug with nearest ("point") mip selection when the fractional
part of max_lod is in (0.5,1). In this case, the spec mandates that
we still select the mip level ceil(max_lod) in the clamping case. However,
MIP_POINT_PRECLAMP will clamp before the mip selection, which is wrong.

Supposedly this setting was originally copied from the closed Vulkan
driver, but as far as I can tell, closed Vulkan was actually changed back
recently :)

Fixes dEQP-GLES3.functional.texture.mipmap.2d.max_lod.{nearest,linear}_nearest

Fixes: f7420ef5b4 ("radeonsi: enable some sampler fields to match the closed driver")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-20 15:43:13 +02:00
Nicolai Hähnle
87f7c7bd65 radeonsi: fix array textures layer coordinate
Like for cube map (array) gather, we need to round to nearest on <= VI.

Fixes tests in dEQP-GLES3.functional.shaders.texture_functions.texture.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-20 15:42:58 +02:00
Nicolai Hähnle
15cae12804 glsl/linker: fix output variable overlap check
Prevent an overflow caused by too many output variables. To limit the
scope of the issue, write to the assigned array only for the non-ES
fragment shader path, which is the only place where it's needed.

Since the function will bail with an error when output variables with
overlapping components are found, (max # of FS outputs) * 4 is an upper
limit to the space we need.

Found by address sanitizer.

Fixes dEQP-GLES3.functional.attribute_location.bind_aliasing.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-09-20 15:35:57 +02:00
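
The bound mentioned above follows from each output location having four components: once overlapping components are rejected, no more than (max outputs) * 4 component slots can ever be claimed. A minimal sketch of such a check, using a per-location component mask rather than the linker's actual data structure, could look like this. MAX_FS_OUTPUTS, struct output_var and find_output_overlap are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FS_OUTPUTS 8            /* hypothetical limit for this sketch */
#define COMPONENTS_PER_LOCATION 4

struct output_var {
    unsigned location;          /* output location index */
    unsigned first_component;   /* 0..3 */
    unsigned num_components;    /* 1..4 */
};

/* Returns the index of the first variable that is out of range or
 * overlaps an earlier one, or -1 if all variables are disjoint. */
int find_output_overlap(const struct output_var *vars, unsigned count)
{
    unsigned char used[MAX_FS_OUTPUTS] = { 0 };  /* 4-bit mask per location */
    unsigned i;

    assert(vars != NULL || count == 0);

    for (i = 0; i < count; i++) {
        const struct output_var *v = &vars[i];
        unsigned mask;

        /* Reject out-of-range input instead of writing past the array. */
        if (v->location >= MAX_FS_OUTPUTS ||
            v->num_components == 0 ||
            v->first_component + v->num_components > COMPONENTS_PER_LOCATION)
            return (int)i;

        mask = ((1u << v->num_components) - 1u) << v->first_component;
        if (used[v->location] & mask)
            return (int)i;          /* overlapping components */
        used[v->location] |= (unsigned char)mask;
    }
    return -1;
}
```

The explicit range check is what prevents the kind of overflow the commit describes when a shader declares more outputs than the tracking storage was sized for.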
Christian Gmeiner
62a8ca22cd etnaviv: move sw query defines to etnaviv_query_sw.h
Also add new define ETNA_SW_QUERY_BASE.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-09-20 14:25:41 +02:00
Christian Gmeiner
a3d79946e5 etnaviv: move sw get_driver_query_info(..)
This change makes etna_get_driver_query_info(..) more generic
and puts the knowledge of supported queries directly besides
the implementation.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-09-20 14:25:13 +02:00
Józef Kucia
65a09f98ad anv: Fix descriptors copying
Trivial.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-20 13:14:49 +02:00
Samuel Pitoiset
dcf46e995d radv: do not update the number of scissors in vkCmdSetScissor()
The Vulkan spec (1.0.61) says:

   "The number of scissors used by a pipeline is still specified
    by the scissorCount member of VkPipelineViewportStateCreateInfo."

So, the number of scissors is defined at pipeline creation
time and shouldn't be updated when they are set dynamically.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-20 10:27:32 +02:00
Samuel Pitoiset
60878dd00c radv: do not update the number of viewports in vkCmdSetViewport()
The Vulkan spec (1.0.61) says:

   "The number of viewports used by a pipeline is still specified
   by the viewportCount member of VkPipelineViewportStateCreateInfo."

So, the number of viewports is defined at pipeline creation
time and shouldn't be updated when they are set dynamically.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-20 10:27:32 +02:00
Samuel Pitoiset
505c2fea3a radv: add some assertions in vkCmdSetScissor()
To check some valid usage requirements.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-20 10:27:32 +02:00
Samuel Pitoiset
2ad1f20cd0 radv: add some assertions in vkCmdSetViewport()
To check some valid usage requirements.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-20 10:27:32 +02:00
Samuel Pitoiset
e5b6cdbf45 radv: inline radv_flush_compute_state() into radv_dispatch()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-20 10:27:32 +02:00
Samuel Pitoiset
8c1ccb5394 radv: add radv_dispatch() helper
To share common dispatch compute code.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-20 10:27:32 +02:00
Samuel Pitoiset
98f7e658a4 radv: add radv_emit_dispatch_packets() helper
To share common dispatch compute code.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-20 10:27:32 +02:00
Dave Airlie
c4ac522511 ac/surface: handle S8 on gfx9
If we don't have a depth piece, we don't get a correct
swizzle mode and we hit an assert in addrlib.

In case of no depth get the preferred swizzle mode for
stencil alone.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-20 15:32:05 +10:00
Krzysztof Sobiecki
39d539e321 egl: fix build fallouts from 1d0be5b3fe
Fixes: 1d0be5b3fe ("wayland-drm: constify the callbacks struct")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-19 21:27:01 +01:00
Jason Ekstrand
9ec51aa0fe anv: Advertise support for VK_FORMAT_R8_SRGB
Unreal Engine 4 seems to really like this format for some reason.  We
don't technically have the hardware format but we do have L8_SRGB.  It's
easy enough to fake with that and a swizzle.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-09-19 12:06:30 -07:00
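
The "L8_SRGB plus a swizzle" trick can be illustrated as follows. This sketch assumes the usual luminance convention, where a luminance texel L is returned as (L, L, L, 1), and a red-only format is expected to return (R, 0, 0, 1). The exact swizzle anv programs is not given in the commit message, so the one below is an assumption, and all type and function names are hypothetical. sRGB decoding is not modeled.

```c
#include <assert.h>

enum chan { CHAN_RED, CHAN_GREEN, CHAN_BLUE, CHAN_ALPHA, CHAN_ZERO, CHAN_ONE };

struct rgba { float r, g, b, a; };
struct swizzle { enum chan r, g, b, a; };

/* What a luminance format returns for a texel value l. */
struct rgba sample_luminance(float l)
{
    return (struct rgba){ l, l, l, 1.0f };
}

static float select_channel(struct rgba in, enum chan sel)
{
    switch (sel) {
    case CHAN_RED:   return in.r;
    case CHAN_GREEN: return in.g;
    case CHAN_BLUE:  return in.b;
    case CHAN_ALPHA: return in.a;
    case CHAN_ZERO:  return 0.0f;
    case CHAN_ONE:   return 1.0f;
    }
    assert(!"invalid channel selector");
    return 0.0f;
}

struct rgba apply_swizzle(struct rgba in, struct swizzle s)
{
    return (struct rgba){
        select_channel(in, s.r), select_channel(in, s.g),
        select_channel(in, s.b), select_channel(in, s.a),
    };
}

/* Emulate a red-only format on top of a luminance format:
 * keep red, force green and blue to zero, alpha to one. */
struct rgba sample_r8_via_l8(float texel)
{
    const struct swizzle r8 = { CHAN_RED, CHAN_ZERO, CHAN_ZERO, CHAN_ONE };
    return apply_swizzle(sample_luminance(texel), r8);
}
```

Because the swizzle is applied after the format conversion, the shader sees the same values it would get from a native single-channel red format.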
Jason Ekstrand
a8ba57d356 intel/blorp: Support clearing L8_UNORM_SRGB surfaces
Vulkan needs to be able to clear any texture you can create.  We want to
add support for VK_FORMAT_R8_SRGB and we need to use L8_UNORM_SRGB to do
that so we need to be able to clear it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-09-19 12:06:26 -07:00
Emil Velikov
4df0d50857 egl: use switch statements over if/else chain
Shorter, explicit and consistent with the rest of the codebase.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-19 19:08:41 +01:00
Emil Velikov
caf7fb627d egl: remove unneeded braces around single line if statements
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-19 19:08:34 +01:00
Emil Velikov
b94344f1c7 egl: simplify _eglDebugReport* API
Instead of having three, almost identical but not quite,
_eglDebugReport* functions, simply fold them into one.

While doing so drop the unnecessary arguments 'command' and
'objectLabel'. Former is identical to funcName, while the latter is
already stored (yet unused) in _EGLThreadInfo::CurrentObjectLabel.

Cc: Kyle Brenneman <kbrenneman@nvidia.com>
Cc: Adam Jackson <ajax@redhat.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (IRC)
2017-09-19 19:07:16 +01:00
Emil Velikov
5af2673479 egl: use _eglError's 'msg' as an actual message in EGL_KHR_debug
Seemingly, the original intent behind _eglError's 'msg' was to
provide a function name.

At some point, people started using it the way EGL_KHR_debug's
callback() message is meant to be used. Aka providing meaningful
information to the developer/user.

Swap the funcName/msg argument order in the _eglDebugReport() call.
The 'funcName' variable is implicitly set, props to the
_eglSetFuncName() call at the start of each public entrypoint.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-19 19:07:12 +01:00
Emil Velikov
191402c0af automake: adjust wayland-drm comment
Vulkan does not depend on the library or any of the objects
created in the process.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-09-19 19:02:34 +01:00
Emil Velikov
2785090a2a configure.ac: split the wayland client/server confusion
At the moment wayland-clients, such as the Vulkan drivers were
over-linking against libwayland-server.so.

That went unnoticed, since both client and server code uses the
wl*interface symbols, which are present in both libwayland-client.so and
libwayland-server.so.

I've looked at correcting that, although that's orthogonal to this fix.

Note: wayland-egl does _not_ depend on wayland-client, although it does
need wayland-egl.h. There's no distinct package that provides it (I have
a WIP on the topic) so current solution will do for now.

v2: Rebase with the "...inline wayland_drm_buffer_get" patch removed.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-09-19 19:02:34 +01:00
Emil Velikov
1d0be5b3fe wayland-drm: constify the callbacks struct
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-09-19 19:02:34 +01:00
Emil Velikov
0007195d81 wayland-drm: add wl_display/wl_resource forward declarations
... making the header self-contained.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-09-19 19:02:34 +01:00
Emil Velikov
fa6b9be22c configure.ac: define WL_HIDE_DEPRECATED at global scale
Due to the GCC feature described in the previous commit, the expected
deprecation warnings may be missing.

Set the WL_HIDE_DEPRECATED macro which will omit the deprecated
functionality, resulting in more distinct build issues.

That is safe since the symbols guarded within the macro are static.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Suggested-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-09-19 19:02:34 +01:00
Micah Fedke
be52bd17eb wayland-drm: avoid deprecated use of struct wl_resource
Wayland v1.2 with commit 1488c96a5db ("Add accessor functions for
wl_resource and deprecate wl_client_add_resource") paves the way towards
making wl_resource opaque.

Namely, new helpers were introduced and the struct was annotated as
deprecated.

Since wayland headers are normally installed in /usr/include, which is
in -isystem, GCC did not generate warnings as documented in the manual.
  "Warnings from system headers are normally suppressed..."

Signed-off-by: Micah Fedke <micah.fedke@collabora.co.uk>
Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
[Emil Velikov: add commit message]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-09-19 19:02:34 +01:00
Emil Velikov
15a6ebdfbb wayland-drm: remove unused wayland_drm_buffer_get_{format,buffer}
Unused anywhere throughout the codebase. We could start using it,
although that contradicts an evil plan* of mine.

* Only wayland servers will make use of the static library, providing
actual distinction between server vs client.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-09-19 19:02:34 +01:00
Emil Velikov
2f0342330c wayland-drm: remove hardcoded enum wl_drm_format
The exact same copy is generated in the client/server protocol header.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-09-19 19:02:34 +01:00
Eric Anholt
58cd67655c broadcom/genxml: Set up enums for VC5 blending, depth, stencil, and prims.
These will be used in tables in the Vulkan driver, and give us pretty CLIF
dump output.
2017-09-19 10:40:55 -07:00
Eric Anholt
af3c521528 broadcom/genxml: Add support for enum-typed fields.
This basically comes from the intel genxml script.  This will help improve
gdb and CLIF output once we convert fields over.
2017-09-19 10:40:55 -07:00
Juan A. Suarez Romero
d3a773611c intel: automake: add isl_genX_priv.h in the source list
Fixes:

 CC       isl/isl_format_layout.lo
In file included from
../../../../src/intel/isl/isl_storage_image.c:24:0:
../../../../src/intel/isl/isl_priv.h:170:29: fatal error:
isl_genX_priv.h: No such file or directory
compilation terminated.
Makefile:2936: recipe for target 'isl/isl_storage_image.lo' failed
make[5]: *** [isl/isl_storage_image.lo] Error 1
make[5]: *** Waiting for unfinished jobs....
In file included from ../../../../src/intel/isl/isl.c:36:0:
../../../../src/intel/isl/isl_priv.h:170:29: fatal error:
isl_genX_priv.h: No such file or directory
compilation terminated.
make[5]: *** [isl/isl.lo] Error 1
Makefile:2936: recipe for target 'isl/isl.lo' failed
make[4]: *** [all] Error 2

when running `make distcheck`.

v2: Fix commit title (Emil)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-19 19:23:46 +02:00
Juan A. Suarez Romero
88bf3b7715 vulkan: automake: add vk_android_native_buffer.h in the source list
Fixes:

  CCLD     libvulkan_wsi.la
ar: `u' modifier ignored since `D' is the default (see `U')
../../../../src/vulkan/util/vk_enum_to_str.c:26:45: fatal error:
vulkan/vk_android_native_buffer.h: No such file or directory
compilation terminated.
make[5]: *** [util/vk_enum_to_str.lo] Error 1

When running `make distcheck`.

v2: Fix commit title (Emil)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-19 19:23:45 +02:00
Ian Romanick
9ac8fece63 glsl: Unify ir_constant::const_elements and ::components
There was no reason to treat array types and record types differently.
Unifying them saves a bunch of code and saves a few bytes in every
ir_constant.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2017-09-19 12:02:43 -05:00
Ian Romanick
0e88153e99 glsl: Rename ir_constant::array_elements to ::const_elements
The next patch will unify ::array_elements and ::components, so the
name ::array_elements wouldn't be appropriate.  A lot of things use
the names array_elements and components, so grepping for either is
pretty useless.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2017-09-19 12:02:43 -05:00
Ian Romanick
e5145e28ea glsl: Silence unused parameter warnings
glsl/ast_type.cpp: In function ‘void merge_bindless_qualifier(YYLTYPE*, _mesa_glsl_parse_state*, const ast_type_qualifier&, const ast_type_qualifier&)’:
glsl/ast_type.cpp:189:35: warning: unused parameter ‘loc’ [-Wunused-parameter]
 merge_bindless_qualifier(YYLTYPE *loc,
                                   ^~~
glsl/ast_type.cpp:191:52: warning: unused parameter ‘qualifier’ [-Wunused-parameter]
                          const ast_type_qualifier &qualifier,
                                                    ^~~~~~~~~
glsl/ast_type.cpp:192:52: warning: unused parameter ‘new_qualifier’ [-Wunused-parameter]
                          const ast_type_qualifier &new_qualifier)
                                                    ^~~~~~~~~~~~~

glsl/ir_constant_expression.cpp: In member function ‘virtual ir_constant* ir_rvalue::constant_expression_value(void*, hash_table*)’:
glsl/ir_constant_expression.cpp:512:44: warning: unused parameter ‘mem_ctx’ [-Wunused-parameter]
 ir_rvalue::constant_expression_value(void *mem_ctx, struct hash_table *)
                                            ^~~~~~~
glsl/ir_constant_expression.cpp: In member function ‘virtual ir_constant* ir_texture::constant_expression_value(void*, hash_table*)’:
glsl/ir_constant_expression.cpp:705:45: warning: unused parameter ‘mem_ctx’ [-Wunused-parameter]
 ir_texture::constant_expression_value(void *mem_ctx, struct hash_table *)
                                             ^~~~~~~
glsl/ir_constant_expression.cpp: In member function ‘virtual ir_constant* ir_assignment::constant_expression_value(void*, hash_table*)’:
glsl/ir_constant_expression.cpp:851:48: warning: unused parameter ‘mem_ctx’ [-Wunused-parameter]
 ir_assignment::constant_expression_value(void *mem_ctx, struct hash_table *)
                                                ^~~~~~~
glsl/ir_constant_expression.cpp: In member function ‘virtual ir_constant* ir_constant::constant_expression_value(void*, hash_table*)’:
glsl/ir_constant_expression.cpp:859:46: warning: unused parameter ‘mem_ctx’ [-Wunused-parameter]
 ir_constant::constant_expression_value(void *mem_ctx, struct hash_table *)
                                              ^~~~~~~

glsl/linker.cpp: In function ‘void link_xfb_stride_layout_qualifiers(gl_context*, gl_shader_program*, gl_linked_shader*, gl_shader**, unsigned int)’:
glsl/linker.cpp:1655:60: warning: unused parameter ‘linked_shader’ [-Wunused-parameter]
                                   struct gl_linked_shader *linked_shader,
                                                            ^~~~~~~~~~~~~
glsl/linker.cpp: In function ‘void link_bindless_layout_qualifiers(gl_shader_program*, gl_program*, gl_shader**, unsigned int)’:
glsl/linker.cpp:1693:52: warning: unused parameter ‘gl_prog’ [-Wunused-parameter]
                                 struct gl_program *gl_prog,
                                                    ^~~~~~~

glsl/lower_distance.cpp: In member function ‘virtual void {anonymous}::lower_distance_visitor_counter::handle_rvalue(ir_rvalue**)’:
glsl/lower_distance.cpp:652:59: warning: unused parameter ‘rv’ [-Wunused-parameter]
 lower_distance_visitor_counter::handle_rvalue(ir_rvalue **rv)
                                                           ^~

glsl/opt_array_splitting.cpp: In member function ‘virtual ir_visitor_status {anonymous}::ir_array_reference_visitor::visit_leave(ir_assignment*)’:
glsl/opt_array_splitting.cpp:198:56: warning: unused parameter ‘ir’ [-Wunused-parameter]
 ir_array_reference_visitor::visit_leave(ir_assignment *ir)
                                                        ^~

glsl/glsl_parser_extras.cpp: In function ‘void assign_subroutine_indexes(gl_shader*, _mesa_glsl_parse_state*)’:
glsl/glsl_parser_extras.cpp:1869:45: warning: unused parameter ‘sh’ [-Wunused-parameter]
 assign_subroutine_indexes(struct gl_shader *sh,
                                             ^~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2017-09-19 12:02:43 -05:00
Juan A. Suarez Romero
c408c92d28 glsl: buffer variables can be readonly and writeonly
GLSL ES 3.10, section 4.9 [Memory Access Qualifiers], has the
following description:

"A variable could be qualified as both readonly and writeonly,
disallowing both read and write, but still be passed to
imageSize() to have the size queried.".

This applies to image variables, but not to buffer variables.

According to https://github.com/KhronosGroup/OpenGL-API/issues/7 Khronos
intent is to allow both readonly and writeonly in buffer variables, and
as such it will update the GLSL specification.

This commit addresses this issue, and fixes:

KHR-GL{43,44,45}.shader_storage_buffer_object.basic-readonly-writeonly
KHR-GLES31.core.shader_storage_buffer_object.basic-readonly-writeonly

v2: set correctly fields[i] memory flags (Samuel Pitoiset).

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-09-19 18:45:56 +02:00
Jason Ekstrand
1746671a76 Revert "i965: Reset miptree aux state on update_image_buffer"
This reverts commit e97f4b7480.
2017-09-19 06:55:32 -07:00
Zhongmin Wu
7343d27136 egl/android: Use per surface out fence
Use the plumbing introduced with the previous patch to interact with the
Android framework.

Namely: currently we use an invalid fd of -1 for our calls to
ANativeWindow::{queue,cancel}Buffer.

At the same time applications (like flatland) may rely on it being
a valid one. Thus as they attempt to query the timestamp of the fence,
they get unexpected results/behaviour.

In the case of flatland, the benchmark hangs inside getSignalTime().

Make use of the out fence and pass the correct fd to Android.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101655
Signed-off-by: Zhongmin Wu <zhongmin.wu@intel.com>
Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
[Emil Velikov: split from larger patch]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-19 12:12:29 +01:00
Zhongmin Wu
e013ce8d0d egl: Allow creation of per surface out fence
Add plumbing to allow creation of per display surface out fence.

This can be used to implement explicit sync. One user of this is
Android, which will be addressed with the next commit.

Signed-off-by: Zhongmin Wu <zhongmin.wu@intel.com>
Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
[Emil Velikov: reorder so there's no intermittent regressions, split]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-19 12:12:25 +01:00
Yogesh Marathe
0b3fc8f305 egl: Wrap dri3 surface primitive around dri2 egl surface
Originally dri3 egl surface was wrapped around _EGLSurface.

With the next commit we'll add additional attributes, which will be checked
from generic code. Thus in order to access that we need to use
dri2_egl_surface.

The name of the latter is a misnomer - it should really be dri or
dri_common...

Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
[Emil Velikov: commit message, squash the patches appropriately, add
relevant _eglInitSurface hunk to prevent build breakage]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-19 12:12:25 +01:00
Alexandru-Liviu Prodea
c1b0137048 Scons: Add LLVM 5.0 support
1 new required library - LLVMBinaryFormat

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102318
Signed-off-by: Alexandru-Liviu Prodea <liviuprodea@yahoo.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-19 12:08:13 +01:00
Eric Engestrom
31237b054e radv: replace conditional compilation with MAYBE_UNUSED
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-09-19 09:46:18 +01:00
Eric Engestrom
fc7345415f glsl: replace conditional compilation with MAYBE_UNUSED
Suggested-by: Nicolai Hähnle <nhaehnle@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-09-19 09:46:08 +01:00
Eric Anholt
3752ad28f2 broadcom/vc4: Fix use-after-free when deleting a program.
By leaving the compiled shader in the context's stage state, the next
compile of a new FS would look in the old compiled FS for figuring out
whether to set various dirty flags for the VS compile.  Clear out the
pointer when deleting the program, and make sure that we always mark the
state as dirty if the previous program had been lost.  Fixes valgrind
warnings on glsl-max-varyings.

Fixes: 2350569a78 ("vc4: Avoid VS shader recompiles by keeping a set of FS inputs seen so far.")
2017-09-18 20:17:25 -07:00
Kenneth Graunke
b339d63f0d i965: Fix batch map failure check in INTEL_DEBUG=bat handling.
I originally wrote the code to call the maps 'batch' and 'state',
until I remembered that 'batch' is the intel_batchbuffer struct pointer.
The NULL check was still using the wrong variable.

Caught by Coverity.

CID: 1418109
2017-09-18 18:51:26 -07:00
Eric Anholt
4db9ad9893 broadcom/vc4: Fix crashes since the gallium blitter reworks.
Even if we're not clearing color, the blitter has started dereferencing
the color value.
2017-09-18 16:16:00 -07:00
Eric Anholt
9940fb4205 broadcom/vc4: Fix use-after-free trying to mix a quad and tile clear.
The blitter will bind just the depth buffer, which flushes the current job
if we had both a color and depth/stencil.  If the clear was doing partial
depth/stencil (quad-based) and color (tile-based), we'd go on to try to
set up the rest of the tile clear in the now flushed job.

Instead, move the partial clear up before we start setting up the job for
the current FBO state, and re-fetch the job if we're continuing on to a
tile-based clear.  Fixes valgrind failures in fbo-depthtex.

Fixes: 9421a6065c ("vc4: Fix fallback to quad clears of depth in GLX.")
2017-09-18 16:16:00 -07:00
Eric Anholt
d88a75182d broadcom/vc4: Fix use-after-free for flushing when writing to a texture.
I was trying to continue the hash table loop, not the inner loop.  This
tended to work out, because we would have *just* freed the job struct.
Fixes some valgrind failures in fbo-depthtex.
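
The control-flow mistake described above, where `continue` applies to the inner loop rather than to the outer walk over the job table, can be sketched as follows. This is only an illustration in Python, not the vc4 code: the names `Job` and `flush_jobs_writing` are hypothetical, and a `freed` flag stands in for the job struct being released.

```python
class Job:
    """A queued job; `freed` stands in for the job struct being released."""

    def __init__(self, name, color, zs):
        self.name = name
        self._surfaces = {'color': color, 'zs': zs}
        self.freed = False

    def surface(self, slot):
        # Reading a job after it was flushed models the use-after-free.
        if self.freed:
            raise RuntimeError('use of freed job ' + self.name)
        return self._surfaces[slot]

    def flush(self):
        self.freed = True


def flush_jobs_writing(jobs, resource):
    """Flush each job that references `resource`; return flushed job names."""
    flushed = []
    for job in list(jobs):                 # outer loop: the job table
        for slot in ('color', 'zs'):       # inner loop: the job's surfaces
            if job.surface(slot) == resource:
                job.flush()
                flushed.append(job.name)
                # Leave the inner loop so the outer loop moves on to the
                # next job. A bare `continue` here would only advance to
                # the next slot and read the job that was just flushed.
                break
    return flushed
```

In C the equivalent fix would typically be a `break` out of the inner loop or a restructuring of the loops; which of these the actual patch uses is not stated in the message.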

Fixes: f597ac3966 ("vc4: Implement job shuffling")
2017-09-18 16:15:58 -07:00
Eric Anholt
6e3d7a5916 ttn: Fix out-of-bounds accesses since the always-2D-constants change.
Only one of the three checks for dim was updated, so we would try to set a
UBO buffer index source value on a nir_load_uniform, and wouldn't actually
declare non-UBO uniforms.

Fixes: 37dd8e8dee ("gallium: all drivers should accept two-dimensional constant buffer indexing")
Tested-by: Derek Foreman <derekf@osg.samsung.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-18 16:14:27 -07:00
Chad Versace
9f763c1f9b anv/android: Disable surface and swapchain extensions (v2)
Android's Vulkan loader implements VK_KHR_surface and VK_KHR_swapchain,
and applications cannot access the driver's implementation. Moreover,
if the driver exposes those extension strings, then tests
dEQP-VK.api.info.instance.extensions and dEQP-VK.api.info.device fail
due to the duplicated strings.

v2: Replace !ANDROID with ANV_HAS_SURFACE. (for jekstrand)

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
2017-09-18 14:27:27 -07:00
Chad Versace
39c9d43536 anv: Feed vk_android_native_buffer.xml to generators (v2)
Feed the XML to anv_extensions.py and anv_entrypoints_gen.py.
Do it on all platforms, not just Android. Tested on Android and Fedora.

We always parse the Android XML, regardless of target platform, to
help reduce the chance that people working on non-Android break the
Android build.

v2:
  - Squash in Tapani's changes to Android.*.mk.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)
2017-09-18 14:26:54 -07:00
Chad Versace
6a5ff18039 anv: Teach generator scripts how to parse multiple XML files
The taught scripts are anv_extensions.py and anv_entrypoints_gen.py.  To
give a script multiple XML files, call it like so:

    anv_extensions.py --xml a.xml --xml b.xml --xml c.xml ...

The scripts parse the XML files in the given order.

This will allow us to feed the scripts XML files for extensions that are
missing from the official vk.xml, such as VK_ANDROID_native_buffer.
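
A repeated `--xml` option of the kind shown above can be sketched with Python's `argparse`. This is a minimal illustration rather than the code in anv_extensions.py: the function names `parse_args` and `collect_extension_names` are hypothetical, and the sketch assumes extensions appear as `<extension name="...">` elements in each registry file.

```python
import argparse
import xml.etree.ElementTree as ET


def parse_args(argv):
    """Accept --xml any number of times, keeping command-line order."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--xml', dest='xml_files', action='append',
                        required=True, help='Vulkan registry XML file')
    return parser.parse_args(argv)


def collect_extension_names(xml_sources):
    """Parse each registry in order; an extension seen earlier wins."""
    names = []
    for source in xml_sources:
        # ET.parse accepts either a file name or an open file object.
        root = ET.parse(source).getroot()
        for ext in root.iter('extension'):
            name = ext.get('name')
            if name is not None and name not in names:
                names.append(name)
    return names
```

Using `action='append'` keeps the files in the order given, which matches the statement that the scripts parse the XML files in the given order.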

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-09-18 14:26:54 -07:00
Chad Versace
2d1fac119f vulkan/registry: Feed vk_android_native_buffer.xml to gen_enum_to_str.py
Tested on Android and Fedora.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-09-18 14:26:54 -07:00
Chad Versace
7f57e58e27 vulkan/util: Teach gen_enum_to_str.py to parse multiple XML files
To give the script multiple XML files, call it like so:

    gen_enum_to_str.py --xml a.xml --xml b.xml --xml c.xml ...

The script parses the XML files in the given order.

This will allow us to feed the script XML files for extensions that are
missing from the official vk.xml, such as VK_ANDROID_native_buffer.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-09-18 14:26:54 -07:00
Chad Versace
7554fa266a vulkan/registry: Add VK_ANDROID_native_buffer
The VK_ANDROID_native_buffer extension is missing from the official
vk.xml. This patch defines the extension in a separate, minimal XML
file: vk_android_native_buffer.xml.

I chose to add the extension to a new XML file instead of adding it to
the official vk.xml in order to avoid conflicts each time we sync the
vk.xml from Khronos.

This should be only a temporary solution until Jesse Hall is persuaded
to add it to the official vk.xml.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-09-18 14:26:54 -07:00
Chad Versace
f07ac34e6f vulkan: Add #ifdef hack to vk_android_native_buffer.h
This patch consolidates many potential `#ifdef ANDROID` messes
throughout src/vulkan and src/intel/vulkan into a simple, localized
hack. The hack is an `#ifdef ANDROID` in vk_android_native_buffer.h
that, on non-Android platforms, avoids including the Android platform
headers and typedefs any Android-specific types to void*.

This hack doesn't remove *all* the `#ifdef ANDROID`s in upcoming
patches, but it does remove a lot.

I first tried implementing VK_ANDROID_native_buffer without this hack,
but eventually gave up when the yak shaving became too much.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-09-18 14:26:54 -07:00
Chad Versace
5872ccc9ac vulkan: Import vk_android_native_buffer.h
Just as Mesa imports the Khronos Vulkan headers, it should import this
Android-private Vulkan header too. This guarantees that Mesa will
continue to build even when upstream Android breaks header
compatibility.

This header is only for *implementers* of Vulkan, not for consumers of
Vulkan.

Imported from tag 'android-7.1.1_r28' in aosp/frameworks/native.

References: https://android.googlesource.com/platform/frameworks/native/+/android-7.1.1_r28/vulkan/include/vulkan/vk_android_native_buffer.h
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-09-18 14:26:54 -07:00
Jason Ekstrand
0699319bb0 i965: Use prepare_external instead of make_shareable in setTexBuffer2
The setTexBuffer2 hook from GLX is used to implement glxBindTexImageEXT
which has tighter restrictions than just "it's shared".  In particular,
it says that any rendering to the image while it is bound causes the
contents to become undefined.  This means that we can do whatever aux
tracking we want between glxBindTexImageEXT and glxReleaseTexImageEXT so
long as we always transition from external in Bind and to external in
Release.

The fact that we were using make_shareable before was a problem because
it would resolve away 100% of the aux data and then throw away our
reference to the aux buffer.  If the aux data was shared with some other
application (i.e. if we're using I915_FORMAT_MOD_Y_TILED_CCS) then we
would forget that the aux data even existed for the rest of eternity.
This is fine for the first frame but any subsequent calls to
glxBindTexImageEXT would bind the texture as if it has no aux
whatsoever and no resolves would happen and texturing would happen as if
there is no aux.  This was causing rendering corruption in mutter when
running on top of X11 with modifiers.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-09-18 12:17:59 -07:00
Jason Ekstrand
d80cbbeaff i965/tex_image: Reference the renderbuffer miptree in setTexBuffer2
The old code made a new miptree that referenced the same BO as the
renderbuffer and just trusted in the memory aliasing to work.  There are
only two ways in which the new miptree is liable to differ from the one
in the renderbuffer and neither of them matter:

 1) It may have a different target.  The only targets that we can ever
    see in intelSetTexBuffer2 are GL_TEXTURE_2D and GL_TEXTURE_RECTANGLE
    and the difference between the two doesn't matter as far as the
    miptree is concerned; genX(update_sampler_state) only looks at the
    gl_texture_object and not the miptree when determining whether or
    not to use normalized coordinates.

 2) It may have a very slightly different format.  Again, this doesn't
    matter because we've supported texture views for quite some time so
    we always look at the gl_texture_object format instead of the
    miptree format for hardware setup anyway.

On the other hand, because we were recreating the miptree, we were using
intel_miptree_create_for_bo which doesn't understand modifiers.  We
really want this function to work without doing a resolve so long as you
have modifiers so we need to fix that.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-09-18 12:16:55 -07:00
Jason Ekstrand
e97f4b7480 i965: Reset miptree aux state on update_image_buffer
When we get a miptree in through glxBindImageEXT, we don't know the
current aux state so we have to assume the worst-case.  If the image
gets recreated, everything is fine because miptreecreate_for_dri_image
sets it to the default.  However, if our miptree is recycled, then we
may have stale aux_usage and we need to reset to the default otherwise
our aux_state tracking will get messed up.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-09-18 12:16:50 -07:00
Jason Ekstrand
400ffa748e intel/isl: Add a drm_modifier_get_default_aux_state helper
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-09-18 12:14:24 -07:00
Kenneth Graunke
f3f42fd867 i965: Warn for GTT fallbacks when mapping the batch/state buffers.
This shouldn't really happen in practice, but I hit it a couple of times
when running a driver with a bad memory leak.  We may as well hook up
the warning, because if it ever triggers, we'll know something is wrong.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-18 09:49:10 -07:00
Kenneth Graunke
a2ef69a21d i965: Plumb brw through to intel_batchbuffer_reset().
We'll want to pass this to brw_bo_map in a moment.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-18 09:48:55 -07:00
Marek Olšák
a5b764cfea radeonsi: reallocate if a non-sharable texture is being shared
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-18 17:47:49 +02:00
Marek Olšák
7b616f7b71 radeonsi: PIPE_BIND_SHARED should allow inter-process sharing
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-18 17:47:49 +02:00
Nicolai Hähnle
f0233ac82d freedreno: compile fix
Fixes: 3f6b3d9db ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE")
Reported-by: Jan Vesely <jan.vesely@rutgers.edu>
2017-09-18 17:39:20 +02:00
Jan Vesely
30741187c1 clover: add missing include to compat.h
Fixes build issues with llvm-3.6
Fixes: 3115687f9b (clover: Fix build after LLVM r313390)

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Tested-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-18 16:32:09 +01:00
Jan Vesely
fdf0f1db22 clover: Query and export half precision support
v2: PIPE_CAP_HALFS -> PIPE_SHADER_CAP_FP16
    has_halfs -> has_halves

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-09-18 10:45:02 -04:00
Jan Vesely
7b2c5547c3 gallium: Add PIPE_SHADER_CAP_FP16
Denotes native half precision float operations capability
v2: PIPE_CAP_HALFS -> PIPE_SHADER_CAP_FP16
    fix indentation

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-18 10:45:02 -04:00
Jason Ekstrand
1a994b053d anv: Implement VK_KHR_image_format_list
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-09-18 07:35:37 -07:00
Jason Ekstrand
52a89fedf2 anv: Implement VK_KHR_bind_memory2
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-09-18 07:33:59 -07:00
Benedikt Schemmer
c302f8fa7c nvc0: fix compile error
Fixes: 3f6b3d9db ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE")
Signed-off-by: Benedikt Schemmer <ben@besd.de>
Previously-pointed-out-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-18 15:31:35 +02:00
Nicolai Hähnle
7a62f8621a radeonsi: allow out-of-order rasterization in commutative blending cases
We do not enable this by default for additive blending, since it slightly
breaks OpenGL invariance guarantees due to non-determinism.

Still, there may be some applications that can benefit from white-listing
via the radeonsi_commutative_blend_add drirc setting without any real
visible artifacts.
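
One way reordering can change the result of additive blending is that floating-point addition is not associative, so accumulating the same source values in a different order may round differently. Whether this is the exact source of the non-determinism mentioned above is an assumption; the helper `blend_add` is hypothetical and models only `dst = dst + src` on a single channel.

```python
def blend_add(dst, sources):
    """Accumulate sources into dst in the given order (dst = dst + src)."""
    for src in sources:
        dst = dst + src
    return dst


# The same three contributions, applied in two different orders.
in_order = blend_add(0.0, [0.1, 0.2, 0.3])
reordered = blend_add(0.0, [0.3, 0.2, 0.1])

# The two sums differ only in the last bits of the result.
print(in_order == reordered, abs(in_order - reordered))
```

The difference is tiny, which is consistent with the remark that some applications may see no real visible artifacts.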

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-18 11:25:20 +02:00
Nicolai Hähnle
8c56c45cd4 radeonsi: add drirc option "radeonsi_assume_no_z_fights"
This option enables a performance optimization where typical non-blending
draws with depth buffer may be rasterized out-of-order (on VI+, multi-SE
chips).

This optimization can lead to incorrect results when an application
renders multiple objects with the same Z value at the same pixel, so we
will never enable it by default. But there may be applications that could
benefit from white-listing.
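
The hazard described above can be illustrated with a toy depth test on a single pixel. The sketch assumes a LESS depth function and uses a hypothetical helper `rasterize`; it does not model the hardware.

```python
def rasterize(fragments, depth_clear=1.0):
    """Apply a LESS depth test to fragments hitting one pixel, in order."""
    depth, color = depth_clear, None
    for frag_z, frag_color in fragments:
        if frag_z < depth:  # LESS: an equal Z does not replace the pixel
            depth, color = frag_z, frag_color
    return color


# Two objects with the same Z at the same pixel: the winner depends on order.
same_z = [(0.5, 'red'), (0.5, 'blue')]
print(rasterize(same_z), rasterize(list(reversed(same_z))))

# With distinct Z values the result is independent of order.
distinct_z = [(0.4, 'red'), (0.5, 'blue')]
print(rasterize(distinct_z), rasterize(list(reversed(distinct_z))))
```

When all fragments at a pixel have distinct Z values, any processing order gives the same final color, which is why the option name speaks of assuming no Z fights.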

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-18 11:25:19 +02:00
Nicolai Hähnle
aab134cfa5 radeonsi: enable out-of-order rasterization when possible on VI and GFX9 dGPUs
This does not take commutative blending into account yet.

R600_DEBUG=nooutoforder disables it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-18 11:25:19 +02:00
Nicolai Hähnle
66d03d0e3e gallium/radeon: pass old_(perfect_)enable to set_occlusion_query_state
The callee can derive the current enable state itself.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-18 11:25:19 +02:00
Nicolai Hähnle
3f6b3d9db7 gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE
To be able to properly distinguish between GL_ANY_SAMPLES_PASSED
and GL_ANY_SAMPLES_PASSED_CONSERVATIVE.

This patch goes through all drivers, having them treat the two
query types identically, except:

1. radeon incorrectly enabled conservative mode on
   PIPE_QUERY_OCCLUSION_PREDICATE. We now do it correctly, only
   on PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE.
2. st/mesa uses the new query type.

Fixes dEQP-GLES31.functional.fbo.no_attachments.*

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-18 11:25:18 +02:00
Nicolai Hähnle
94736d31c3 amd/common: add workaround for cube map array layer clamping
Fixes dEQP-GLES31.functional.texture.filtering.cube_array.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-18 11:25:18 +02:00
Nicolai Hähnle
6772452e4c amd/common: remove has_ds_bpermute argument from ac_build_ddxy
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-18 11:25:18 +02:00
Nicolai Hähnle
3db86d86ed amd/common: add chip_class to ac_llvm_context
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-18 11:25:18 +02:00
Nicolai Hähnle
e0af3bed2c amd/common: round cube array slice in ac_prepare_cube_coords
The NIR-to-LLVM pass already does this; now the same fix covers
radeonsi as well.

Fixes various tests of
dEQP-GLES31.functional.texture.filtering.cube_array.combinations.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-18 11:25:18 +02:00
Nicolai Hähnle
6fb0c1013b radeonsi: workaround for gather4 on integer cube maps
This is the same workaround that radv already applied in commit
3ece76f03d ("radv/ac: gather4 cube workaround integer").

Fixes dEQP-GLES31.functional.texture.gather.basic.cube.rgba8i/ui.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-18 11:25:17 +02:00
Nicolai Hähnle
b7b4a14db5 st/glsl_to_tgsi: fix theoretical memory leak
It can't *really* happen since we don't use subroutines.

CID: 1417491
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
2017-09-18 11:25:17 +02:00
Iago Toral Quiroga
3d9cb39fd0 i965: emit BRW_NEW_AUX_STATE on aux state changes
Fixes a regression introduced with b96313c0e1, which removed
BRW_NEW_BLORP for a bunch of SURFACE_STATE setup code, including render
targets, on the basis that blorp invalidates binding tables but not
surface states. However, at least on Broadwell, this caused a regression
in a CTS test, which Ken and Jason tracked down to the fact that we
are not uploading new render target surface states after allocating
new CCS_D surfaces for fast clears (which allocation is deferred until
an actual clear occurs).

The reason this only fails in BDW is that on SKL+ we use CCS_E which
is allocated up front so it exists in the initial surface state. The
problem can be reproduced on these platforms too if we use
INTEL_DEBUG=norcb to force the CCS_D path.

This patch, together with the ones preceding it, fixes the regression
by ensuring that we track and flag as dirty all aux state changes.

Credit goes to Jason and Ken for figuring out the reason for the
regression.

Fixes:
KHR-GL45.transform_feedback.draw_xfb_test

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-18 10:47:51 +02:00
Iago Toral Quiroga
9a8bf42308 i965: emit BRW_NEW_AUX_STATE when we change the fast clear value
v2: rename intel_miptree_set_clear_value to intel_miptree_set_clear_color
    (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-18 10:47:51 +02:00
Iago Toral Quiroga
ca65b9e62d i965: emit BRW_NEW_AUX_STATE if we drop the aux surface
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-18 10:47:51 +02:00
Iago Toral Quiroga
5b27816b22 i965: rename BRW_NEW_FAST_CLEAR_COLOR to BRW_NEW_AUX_STATE
We want to use this flag to signal changes to the aux surfaces,
so let's not make it about fast clearing only. Suggested by Jason.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-18 10:47:51 +02:00
Emil Velikov
ac8ccf2543 docs: update calendar, add news item and link release notes for 17.2.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-18 00:16:42 +01:00
Emil Velikov
f55be0c0ef docs: add sha256 checksums for 17.2.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit bd903d4ee1)
2017-09-18 00:13:43 +01:00
Emil Velikov
b7bfbfd1c5 docs: add release notes for 17.2.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d6d2b6b5ec)
2017-09-18 00:13:42 +01:00
Eric Engestrom
b959eeb4f6 docs: update sourcetree following omx rename
Fixes: 6a8aa11c20 "st/omx_bellagio: Rename state tracker and option"
Cc: Gurkirpal Singh <gurkirpal204@gmail.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-09-17 13:29:46 +01:00
Gert Wollny
e688a9ef6a gbm: Add gbm_device_get_format_modifier_plane_count to test
Adding gbm_device_get_format_modifier_plane_count made the
test gbm-symbols-check fail; this patch adds the corresponding
function name to the test.

Fixes: 8824141b8d
 (gbm: Add a gbm_device_get_format_modifier_plane_count function)

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-09-17 12:53:46 +03:00
Andres Gomez
7e8f03bfc0 travis: replace omx feature flag with omx-bellagio one
Fixes: 6a8aa11c20 ("st/omx_bellagio: Rename state tracker and
option")
Signed-off-by: Andres Gomez <agomez@igalia.com>
Cc: Gurkirpal Singh <gurkirpal204@gmail.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-09-17 12:20:56 +03:00
Eric Engestrom
81557af63b docs/submittingpatches: add 'test each commit' instructions
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-09-17 00:21:31 +01:00
Bas Nieuwenhuizen
969537d935 radv: Add support for more DCC compression with VK_KHR_image_format_list.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-16 11:55:56 +02:00
Bas Nieuwenhuizen
d398db2acb radv: Add code to check if two formats can share DCC metadata.
Ported from radeonsi.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-16 11:55:42 +02:00
Kenneth Graunke
4f8d1af0f6 i965: Add an INTEL_DEBUG=reemit option.
Jason and I use this for debugging all the time.  Recompiling the driver
to enable it is kind of annoying.  It's a great thing to try along with
always_flush_batch=true and always_flush_cache=true to detect a class of
problems - namely, atoms listening to an insufficient set of dirty bits.
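
The class of bug mentioned above, an atom listening to an insufficient set of dirty bits, can be sketched as follows. The names `Atom`, `upload_state`, `NEW_BLEND` and `NEW_DEPTH` are hypothetical and do not come from the i965 source; the `reemit` flag only mimics the idea of treating all state as dirty.

```python
NEW_BLEND = 1 << 0
NEW_DEPTH = 1 << 1
ALL_DIRTY = ~0  # every bit set


class Atom:
    """A unit of state that is re-emitted when one of its bits is dirty."""

    def __init__(self, name, dirty_mask):
        self.name = name
        self.dirty_mask = dirty_mask  # bits this atom listens to


def upload_state(atoms, dirty, reemit=False):
    """Return the names of the atoms emitted for this draw."""
    if reemit:
        dirty = ALL_DIRTY  # debugging aid: pretend everything changed
    return [atom.name for atom in atoms if atom.dirty_mask & dirty]


# Bug for illustration: the depth atom listens to the blend bit only.
atoms = [Atom('blend', NEW_BLEND), Atom('depth', NEW_BLEND)]
print(upload_state(atoms, NEW_DEPTH))               # depth state is skipped
print(upload_state(atoms, NEW_DEPTH, reemit=True))  # everything is emitted
```

If a rendering problem disappears when everything is re-emitted, a missing dirty bit on some atom is a likely suspect.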

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-09-15 21:51:45 -07:00
Jan Vesely
3115687f9b clover: Fix build after LLVM r313390
v2: pass llvm context reference instead of a pointer

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-09-15 21:39:54 -04:00
Bas Nieuwenhuizen
5ef3c2bcef radv: Don't redundantly emit pipelines after secondary cmd buffer.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-15 23:12:25 +02:00
Bas Nieuwenhuizen
979978ee06 radv: Check for GFX9 for 1D arrays in image_size intrinsic.
We implement them as 2D images only on GFX9.

This fixes:
dEQP-VK.image.image_size.1d_array.readonly_12x34
dEQP-VK.image.image_size.1d_array.readonly_1x1
dEQP-VK.image.image_size.1d_array.readonly_32x32
dEQP-VK.image.image_size.1d_array.readonly_7x1
dEQP-VK.image.image_size.1d_array.readonly_writeonly_12x34
dEQP-VK.image.image_size.1d_array.readonly_writeonly_1x1
dEQP-VK.image.image_size.1d_array.readonly_writeonly_32x32
dEQP-VK.image.image_size.1d_array.readonly_writeonly_7x1
dEQP-VK.image.image_size.1d_array.writeonly_12x34
dEQP-VK.image.image_size.1d_array.writeonly_1x1
dEQP-VK.image.image_size.1d_array.writeonly_32x32
dEQP-VK.image.image_size.1d_array.writeonly_7x1

Fixes: 1bcb953e16 "radv: handle GFX9 1D textures"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-15 22:06:56 +02:00
Eric Engestrom
915dc6db45 i965: drop unused variables
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-09-15 12:09:13 -07:00
Jason Ekstrand
7bd5931cc1 i965/tex: Unify the TexImage and TexSubImage code
It's nearly the same so there's no good reason why it can't be in a
common function.  The one difference is that _mesa_store_teximage
calls AllocTextureImageBuffer for us, while _mesa_store_texsubimage
doesn't, but we don't need that anyway - intelTexImage already does it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-09-15 10:59:05 -07:00
Jason Ekstrand
bb811fa828 i965/tex: Remove the for_glTexImage parameter from texsubimage_tiled_memcpy
It is set to false in both callers.  It isn't needed for glTexImage
because intelTexImage calls AllocTextureImageBuffer before calling
texsubimage_tiled_memcpy.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-09-15 10:59:04 -07:00
Jason Ekstrand
6314dd13f7 i965/tex: Make a couple of helpers static
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-09-15 10:59:03 -07:00
Jason Ekstrand
82b3ca1981 i965: Move TexSubImage functions to intel_tex_image.c
These two paths are basically the same.  There's no good reason to have
them in different files.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-09-15 10:58:58 -07:00
Jason Ekstrand
a43d379000 i965/blorp: Set r8stencil_needs_update when writing stencil
This fixes a crash on Haswell when we try to upload a stencil texture
with blorp.  It would also be a problem if someone tried to texture from
stencil after glBlitFramebuffers.

Cc: "17.2 17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-09-15 10:58:55 -07:00
Matt Turner
1bbe180873 util/u_atomic: Add implementation of __sync_val_compare_and_swap_8
Needed for 32-bit PowerPC.
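
The semantics of a value-returning compare-and-swap can be sketched as follows, assuming a lock-based fallback for platforms that lack a native 64-bit atomic. That a lock is what the patch uses is an assumption; the function name `val_compare_and_swap` and the one-element list standing in for a memory location are hypothetical.

```python
import threading

_lock = threading.Lock()  # one global lock shared by all emulated atomics


def val_compare_and_swap(cell, expected, desired):
    """Store `desired` in cell[0] if it equals `expected`; return the old value."""
    with _lock:
        old = cell[0]
        if old == expected:
            cell[0] = desired
        return old


cell = [5]
first = val_compare_and_swap(cell, 5, 9)   # matches: the swap happens
second = val_compare_and_swap(cell, 5, 1)  # no match: the value is untouched
```

The caller detects success by comparing the returned value with `expected`, which is the contract of the GCC `__sync_val_compare_and_swap` builtins.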

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Fixes: a6a38a038b ("util/u_atomic: provide 64bit atomics where
they're missing")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-15 09:37:30 -07:00
Matt Turner
d075a4089e util: Link libmesautil into u_atomic_test
Platforms without particular atomic operations require the
implementations in u_atomic.c

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Fixes: a6a38a038b ("util/u_atomic: provide 64bit atomics where
they're missing")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-15 09:37:30 -07:00
Lionel Landwerlin
5ff06ddf3b vulkan: update headers & registry to VK 1.0.61
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-15 08:56:40 -07:00
Emil Velikov
9aba643e3c automake: enable libunwind in `make distcheck'
Enable the toggle to catch when the library is missing from the link
path. Better to test, fail and address before releasing Mesa ;-)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-09-15 13:56:28 +01:00
Gert Wollny
39fe51c1e3 travis: Add libunwind-dev to gallium/make builds
libunwind is an optional dependency used by the gallium aux module
(libgallium) and consequently the final binaries must be linked against
it. To test whether the library is properly specified in the link pass
add it to the travis-ci build environment and force its use.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-15 13:56:28 +01:00
Gert Wollny
d3675812b5 travis: force llvm-3.3 for "make Gallium ST Other"
In Ubuntu Trusty the default version of llvm is 3.4 and the build was
actually randomly picking 3.5 or 3.9. Adding libunwind would then result
in build success or failure depending on which version was picked.

Install the llvm-3.3-dev package and force its use: on one hand it is
the minimum required version we want to build-test against, and on
the other hand forcing the version stabilizes the build.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-15 13:56:28 +01:00
Gert Wollny
c75d781610 mesa/st/tests: Correct build flags and force -std=c++11
Include src/gallium/Automake.inc, correct the build flags accordingly.

Force -std=c++11 (extensively used by the test) as otherwise it gets
defined only when building against llvm >= 3.9.

Fixes: 7be6d8fe12  ("mesa/st: glsl_to_tgsi: add tests for the new
temporary lifetime tracker")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102665
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
2017-09-15 13:56:28 +01:00
Emil Velikov
3c5fb7346f automake: include radv_shader.h in the sources list
Otherwise it will be missing from the tarball, leading to a build failure.

Fixes: d4d777317b ("radv: move shaders related code to radv_shader.c")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-15 13:56:27 +01:00
Gurkirpal Singh
6a8aa11c20 st/omx_bellagio: Rename state tracker and option
Changes --enable-omx option to --enable-omx-bellagio

Signed-off-by: Gurkirpal Singh <gurkirpal204@gmail.com>
Reviewed-and-Tested-by: Julien Isorce <julien.iso...@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-09-15 14:28:36 +02:00
Tapani Pälli
acbfcb7105 i965: fix build warning on clang
fixes following warning:
   warning: format specifies type 'long' but the argument has type 'uint64_t' (aka 'unsigned long long')

cast is needed to avoid this change turning into another warning:
   warning: format specifies type 'unsigned long long' but the argument has type 'uint64_t' (aka 'unsigned long')
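
The two warnings appear because uint64_t maps to a different underlying
type depending on the platform. A minimal sketch of two portable ways to
print such a value, assuming nothing about the actual i965 call site (the
helper names below are hypothetical):

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helpers for illustration; not taken from the i965 driver. */

/* Option 1: cast to a type whose printf specifier is the same everywhere. */
static int format_u64_cast(char *buf, size_t len, uint64_t value)
{
    assert(buf != NULL);
    return snprintf(buf, len, "%llu", (unsigned long long)value);
}

/* Option 2: let <inttypes.h> supply the matching specifier for uint64_t. */
static int format_u64_macro(char *buf, size_t len, uint64_t value)
{
    assert(buf != NULL);
    return snprintf(buf, len, "%" PRIu64, value);
}
```

Both variants compile without format warnings whether uint64_t is
'unsigned long' or 'unsigned long long'; the commit message describes the
cast approach.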

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-09-15 12:39:33 +03:00
Samuel Pitoiset
8e8c7c6703 radv: fix a potential crash if attachments allocation failed
Also, it's useless to set the error code twice. Though, we
should probably skip the next commands when the command buffer
is considered invalid.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-15 09:16:38 +02:00
Samuel Pitoiset
a0495d4bb3 radv: dump the device name into the hang report
Similar to RadeonSI renderer string.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-15 09:16:35 +02:00
Samuel Pitoiset
176c2ad10c radv: add get_chip_name() callback
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-15 09:16:34 +02:00
Dave Airlie
1b163238f5 r600: add .gitignore for egd_tables.h 2017-09-15 13:55:01 +10:00
Timothy Arceri
a70a401f52 radeonsi: enable STD430 packing of UBOs by default
Before this change we were defaulting to STD140 which is slightly
less efficient at packing arrays.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-15 11:42:55 +10:00
Timothy Arceri
fac9f2c4b0 st/mesa: set UseSTD430AsDefaultPacking const based on CAP
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-15 11:42:55 +10:00
Timothy Arceri
c96e45ebf0 gallium: introduce PIPE_CAP_LOAD_CONSTBUF
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-15 11:42:55 +10:00
Timothy Arceri
b4401cc104 radeonsi: make use of LOAD for UBOs
v2: always set can_speculate and allow_smem to true

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-15 11:42:55 +10:00
Timothy Arceri
51cf16319d mesa/st: add LOAD support for UBOs
This will allow us to use STD430 packing by default if the driver
supports it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-15 11:42:55 +10:00
Timothy Arceri
ee0fbc8b71 mesa/st: create add_buffer_to_load_and_stores() helper
Will be used to add LOAD support to UBOs.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-15 11:42:54 +10:00
Timothy Arceri
6fa60b5e40 gallium: add CONSTBUF type to tgsi_file_type
This will be used to distinguish between load types when using
the TGSI_OPCODE_LOAD opcode.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-15 11:42:54 +10:00
Dave Airlie
b6f6ead198 virgl: drop const dimensions on first block.
The virgl protocol version of tgsi doesn't handle this yet,
transform it back to the old ways.

Thanks to Nicolai Hähnle <nicolai.haehnle@amd.com>
for also writing nearly the same patch.

Fixes: 41e342d5 tgsi/ureg: always emit constants (and their decls) as 2D
Tested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-15 10:33:14 +10:00
Dave Airlie
a7a7bf21bd st/glsl->tgsi: fix u64 to bool comparisons.
Otherwise we end up using a 32-bit comparison which didn't end well.
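
As a plain C analogy (not the TGSI code this commit touches), the sketch
below shows why a 32-bit comparison gives the wrong answer for a 64-bit
value. The function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Buggy analogue: only the low 32 bits take part in the comparison. */
static bool u64_to_bool_truncating(uint64_t v)
{
    return (uint32_t)v != 0;
}

/* Correct analogue: all 64 bits are compared against zero. */
static bool u64_to_bool(uint64_t v)
{
    return v != 0;
}

/* A value with only high bits set is non-zero, yet the truncating
 * version reports it as false. */
static void u64_to_bool_demo(void)
{
    uint64_t high_bit_only = UINT64_C(1) << 32;
    assert(u64_to_bool(high_bit_only) == true);
    assert(u64_to_bool_truncating(high_bit_only) == false);
}
```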

Timothy caught this while playing around with some opt passes.

Fixes: 278580729a (st/glsl_to_tgsi: add support for 64-bit integers)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-15 09:49:50 +10:00
Kenneth Graunke
62f2670cba i965: Print size of validation and relocation lists in INTEL_DEBUG=flush
It's nice to have this information.  While we're at it, tweak the
formatting to try and vertically align numbers in the common case.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke
7c5988e615 i965: Disentangle batch and state buffer flushing.
We now flush the batch when either the batchbuffer or statebuffer
reaches the original intended batch size, instead of when the sum of
the two reaches a certain size (which makes no sense now that they're
separate buffers).

With this change, we also need to update our "are we near the end?"
estimate to require separate batch and state buffer space.  I obtained
these estimates by looking at the size of draw calls in the Unreal 4
Elemental Demo (using INTEL_DEBUG=flush and always_flush_batch=true).

This will significantly impact the size of our batches.  I've adjusted
both down to try and be roughly similar to what we had been doing.  On
various benchmarks, a 20kB batch and 16kB statebuffer seemed to be about
right, but we may need to adjust this further.  I tried a 16kB batch,
but that regressed Synmark OglMultithread performance by a fair bit.
32kB for both would have significantly increased our batch sizes.
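
The difference between the old and new flush conditions can be sketched
as follows. This is an illustration only: the function and macro names
are hypothetical, and only the 20kB and 16kB figures come from the text
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sizes quoted in the commit message; the macro names are made up. */
#define SKETCH_BATCH_SZ (20 * 1024)
#define SKETCH_STATE_SZ (16 * 1024)

/* Old behavior: flush when the sum of both usages reaches one limit. */
static bool should_flush_combined(size_t batch_used, size_t state_used,
                                  size_t limit)
{
    assert(limit > 0);
    return batch_used + state_used >= limit;
}

/* New behavior: each buffer is checked against its own limit. */
static bool should_flush_separate(size_t batch_used, size_t state_used)
{
    return batch_used >= SKETCH_BATCH_SZ || state_used >= SKETCH_STATE_SZ;
}
```

With separate limits, a draw that emits little state no longer forces an
early flush just because the command stream is long, and vice versa.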

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke
2c46a67b41 i965: Delete BATCH_RESERVED handling.
Now that we can grow the batchbuffer if we absolutely need the extra
space, we don't need to reserve space for the final do-or-die ending
commands.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke
9034d157c0 i965: Make BLORP properly avoid batch wrapping.
We need to set brw->no_batch_wrap to actually avoid flushing in the
middle of our BLORP operation, and instead grow the batchbuffer.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke
2dfc119f22 i965: Grow the batch/state buffers if we need space and can't flush.
Previously, we would just assert fail and die in this case.  The only
safeguard is the "estimated max prim size" checks when starting a draw
(or compute dispatch or BLORP operation)...which are woefully broken.

Growing is fairly straightforward:

1. Allocate a new larger BO.
2. memcpy the existing contents over to the new buffer
3. Set the new BO to the same GTT offset as the old BO.  When emitting
   relocations, we write the presumed GTT offset of the target BO.  If
   we changed it, we'd have to update all the existing values (by
   walking the relocation list and looking at offsets), which is more
   expensive.  With the old BO freed, ideally the kernel could simply
   place the new BO at that offset anyway.
4. Update the validation list to contain the new BO.
5. Update the relocation list to have the GEM handle for the new BO
   (which we can skip if using I915_EXEC_HANDLE_LUT).
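
The five steps above can be sketched as follows. This is a simplified
model under stated assumptions, not the driver code: every type and
function here (fake_bo, fake_batch, fake_batch_grow, ...) is a
hypothetical stand-in, and plain calloc/free replace real BO allocation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are real i965 structures. */
struct fake_bo {
    uint32_t handle;      /* stand-in for a GEM handle */
    uint64_t gtt_offset;  /* presumed GPU address */
    size_t   size;
    uint8_t *map;         /* CPU-visible contents */
};

struct fake_reloc {
    uint32_t target_handle;
};

struct fake_batch {
    struct fake_bo *bo;
    size_t used;                    /* bytes written so far */
    struct fake_bo *validation[4];  /* slot 0 holds the batch BO */
    struct fake_reloc relocs[4];
    int reloc_count;
};

static uint32_t next_handle = 1;

static struct fake_bo *fake_bo_alloc(size_t size)
{
    struct fake_bo *bo = calloc(1, sizeof(*bo));
    if (!bo)
        return NULL;
    bo->map = calloc(1, size);
    if (!bo->map) {
        free(bo);
        return NULL;
    }
    bo->handle = next_handle++;
    bo->size = size;
    return bo;
}

static void fake_bo_free(struct fake_bo *bo)
{
    free(bo->map);
    free(bo);
}

/* Returns 0 on success, -1 if allocation fails (batch left untouched). */
static int fake_batch_grow(struct fake_batch *batch, size_t new_size)
{
    struct fake_bo *old = batch->bo;
    assert(new_size > old->size);

    /* 1. Allocate a new, larger BO. */
    struct fake_bo *bo = fake_bo_alloc(new_size);
    if (!bo)
        return -1;

    /* 2. Copy the existing contents over. */
    memcpy(bo->map, old->map, batch->used);

    /* 3. Reuse the old presumed offset so values already written
     *    for relocations do not need to be patched. */
    bo->gtt_offset = old->gtt_offset;

    /* 4. Point the validation list entry at the new BO. */
    batch->validation[0] = bo;

    /* 5. Retarget relocations that referred to the old handle. */
    for (int i = 0; i < batch->reloc_count; i++) {
        if (batch->relocs[i].target_handle == old->handle)
            batch->relocs[i].target_handle = bo->handle;
    }

    fake_bo_free(old);
    batch->bo = bo;
    return 0;
}
```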

v2: Update to handle malloc'd shadow buffers.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke
78c404f106 i965: Use a separate state buffer, but avoid changing flushing behavior.
Previously, we emitted GPU commands and indirect state into the same
buffer, using a stack/heap like system where we filled in commands from
the start of the buffer, and state from the end of the buffer.  We then
flushed before the two met in the middle.
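
A minimal model of that single-buffer scheme, assuming nothing about the
real data structures (all names below are hypothetical), looks like
this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the old shared buffer; not i965 code. */
struct two_ended_buf {
    size_t size;         /* total bytes in the shared buffer */
    size_t cmd_end;      /* commands occupy [0, cmd_end) */
    size_t state_start;  /* state occupies [state_start, size) */
};

static void two_ended_init(struct two_ended_buf *b, size_t size)
{
    b->size = size;
    b->cmd_end = 0;
    b->state_start = size;
}

/* Reserve command space from the front. Returns false if the request
 * would make the two regions meet in the middle. */
static bool emit_commands(struct two_ended_buf *b, size_t bytes,
                          size_t *offset)
{
    assert(b->cmd_end <= b->state_start);
    if (bytes > b->state_start - b->cmd_end)
        return false;
    *offset = b->cmd_end;
    b->cmd_end += bytes;
    return true;
}

/* Reserve state space from the back, growing toward the commands. */
static bool emit_state(struct two_ended_buf *b, size_t bytes,
                       size_t *offset)
{
    assert(b->cmd_end <= b->state_start);
    if (bytes > b->state_start - b->cmd_end)
        return false;
    b->state_start -= bytes;
    *offset = b->state_start;
    return true;
}
```

Because state offsets are measured from the end of the buffer, enlarging
it would move every piece of state already emitted.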

Meeting in the middle is fatal, so you have to be certain that you
reserve the correct amount of space before emitting commands or state
for a draw.  Currently, we will assert !no_batch_wrap and die if the
estimate is ever too small.  This has been mercifully obscure, but has
happened on a number of occasions, and could in theory happen to any
application that issues a large draw at just the wrong time.

Estimating the amount of batch space required is painful - it's hard to
get right, and getting it right involves a lot of code that would burn
CPU time, and also be painful to maintain.  Rolling back to a saved
state and retrying is also painful - failing to save/restore all the
required state will break things, and redoing state emission burns a
lot of CPU.  memcpy'ing to a new batch and continuing is painful,
because commands we issue for a draw depend on earlier commands as well
(such as STATE_BASE_ADDRESS, or the GPU being in a particular state).

The best plan is to never run out of space, which is totally doable but
pretty wasteful - a pessimal draw requires a huge amount of space, and
rarely occurs.  Instead, we'd like to grow the batch buffer if we need
more space and can't safely flush.

We can't grow with a meet in the middle approach - we'd have to move the
state to the end, which would mean updating every offset from dynamic
state base address.  Using separate batch and state buffers, where both
fill starting at the beginning, makes it easy to grow either as needed.

This patch separates the two concepts.  We create a separate state
buffer, with a second relocation list, and use that for brw_state_batch.

However, this patch tries to retain the original flushing behavior - it
adds the amount of batch and state space together, as if they were still
co-existing in a single buffer.  The hope is to flush at the same time
as before.  This is necessary to avoid provoking bugs caused by broken
batch wrap handling (which we'll fix shortly).  It also avoids suddenly
increasing the size of the batch (due to state not taking up space),
which could have a significant performance impact.  We'll tune it later.

v2:
- Mark the statebuffer with EXEC_OBJECT_CAPTURE when supported (caught
  by Chris).  Unfortunately, we lose the ability to capture state data
  on older kernels.
- Continue to support the malloc'd shadow buffers.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke
0bf3fa4c53 i965: Pass screen to intel_batchbuffer_reset().
This will let us access screen->kernel_features in the next patch.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke
2e68c4e454 i965: Prepare INTEL_DEBUG=bat decoding for a separate statebuffer.
We'll need to read from both buffers when decoding state.

This also drops the "failed to map" fallback - it's completely useless
on LLC systems where we write directly to the mapped BO.  It's not that
useful on non-LLC systems either.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke
e723255901 i965: Split brw_emit_reloc into brw_batch_reloc and brw_state_reloc.
brw_batch_reloc emits a relocation from the batchbuffer to elsewhere.
brw_state_reloc emits a relocation from the statebuffer to elsewhere.

For now, they do the same thing, but when we actually split the two
buffers, we'll change brw_state_reloc to use the state buffer.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke
1674a0bcbc i965: Refactor relocs into a brw_reloc_list structure.
I'm planning on splitting batch and state into separate buffers, at
which point we'll need two relocation lists.  In preparation for that,
this patch refactors the relocation stuff into a structure we can
replicate...which looks a lot like anv_reloc_list.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-09-14 16:17:36 -07:00
Kenneth Graunke
1bc44e0e7f i965: Move brw_state_batch code to intel_batchbuffer.c
The batch buffer and state buffer code is fairly tied together,
and having it in one .c file will make refactoring easier.

Also, drop some commentary above brw_state_batch.  The "aperture
checking performance hacks" are long since gone, so that paragraph
makes little sense at this point.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke
3b812e62a1 i965: Drop a useless ret == 0 check.
Prior to the previous patch, we would pwrite the batchbuffer contents,
and wanted to skip the execbuffer if that failed.  Now that we memcpy,
we don't set ret != 0 on failure anymore, so it will always be 0.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke
717e753912 i965: Use a WC map and memcpy for the batch instead of pwrite.
We'd like to eliminate the malloc'd shadow copy eventually, but there
are still unresolved performance problems.  In the meantime, let's at
least get rid of pwrite.

On Apollolake, improves Synmark OglBatch6 performance by:
1.53581% +/- 0.269589% (n=108).

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke
343aa09a22 i965: Use batch->bo->size in brw_emit_reloc assertion.
This makes the assertion safe against batchbuffers growing.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Kenneth Graunke
d124521141 i965: Delete a batch size assertion that isn't very useful.
This assertion prevents you from doing intel_batchbuffer_require_space
with a size so huge it won't fit in the batchbuffer.  This doesn't seem
like a common mistake, and I've never seen the assert to be useful.

Soon, I hope to have batches grow, at which point this won't make sense.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-14 16:17:36 -07:00
Jason Ekstrand
939b53d332 i965/screen: Implement queryDmaBufFormatModifierAttribs
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-09-14 14:47:42 -07:00
Jason Ekstrand
9c52aef7d7 i965/screen: Report the correct number of image planes
For non-CCS images, we were reporting just one plane even though they
may have multiple in the case of YUV.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-09-14 14:47:40 -07:00
Jason Ekstrand
8824141b8d gbm: Add a gbm_device_get_format_modifier_plane_count function
This allows the user to query the number of planes required by a given
format+modifier combination without having to create a bo or surface.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-09-14 14:47:39 -07:00
Jason Ekstrand
0a25a417ce dri/image: Add a format modifier attributes query
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-09-14 14:47:18 -07:00
Christoph Berliner
7ffd4d2a66 drirc: enable glthread for more games (Civ5, CivBE, Dreamfall, Hitman, SR3)
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-09-14 21:02:36 +02:00
Iago Toral Quiroga
98141366f9 glsl: avoid accessing invalid memory after get_variable_being_redeclared()
After get_variable_being_redeclared() has been called, it is no longer
safe to access the original variable pointer, since its memory might have
been freed.

Since callers of this function should only be accessing the variable pointer
returned by the function, avoid potential bugs by re-assigning the
original variable pointer to the result of the function call,
making it impossible for the remaining code to access an invalid variable
pointer.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-14 11:23:26 +02:00
Iago Toral Quiroga
a7017746d7 glsl: make the redeclared variable NULL if it is deleted
get_variable_being_redeclared() can delete the original variable
in a specific scenario. The code sets it to NULL after this so other
code in that same function doesn't try to access trashed memory after
the fact. However, the copy of that variable in the caller code
won't see any of this, making it very easy to overlook.

Make the function a bit safer by taking a pointer to the original
variable so we can also make NULL the caller's pointer to the variable
if this function deletes it.
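
The idea can be illustrated with a small C analogue. The real function
lives in the GLSL compiler and differs in detail; the struct and
function names below are hypothetical and only show the
pointer-to-pointer pattern:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical variable record; not the GLSL compiler's type. */
struct var {
    int value;
};

/* If 'earlier' exists, the new declaration is merged into it and freed.
 * Taking a pointer to the caller's pointer lets us clear the caller's
 * copy too, so it cannot be used after the free. */
static struct var *merge_redeclaration(struct var **var_ptr,
                                       struct var *earlier)
{
    assert(var_ptr != NULL && *var_ptr != NULL);

    if (earlier == NULL)
        return *var_ptr;   /* not a redeclaration; nothing is freed */

    earlier->value = (*var_ptr)->value;
    free(*var_ptr);
    *var_ptr = NULL;       /* the caller's pointer becomes NULL as well */
    return earlier;
}
```

After the call, code that mistakenly keeps using the original pointer
dereferences NULL and fails immediately instead of silently reading
freed memory.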

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-14 11:23:26 +02:00
Iago Toral Quiroga
4af156224e glsl: use 'declared_var' instead of 'var' after checking redeclarations
Since the original 'var' might have been deleted from this point forward.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102685
Fixes: 51bf007d2c (glsl: Disallow unsized array of atomic_uint)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-14 11:23:26 +02:00
Eric Engestrom
412ab3f6fd dri/radeon: use ARRAY_SIZE macro
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-09-14 09:56:00 +01:00
Samuel Pitoiset
49c72d84c2 radv: dump the list of enabled options when a hang occurred
Useful to know which debug/perftest options were enabled when
a hang report is generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset
302e34d24b radv: dump last 60 lines of dmesg when a hang occurred
Copied from dd_dump_dmesg().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset
26bc664ca0 radv: dump descriptors when a hang occurred
Might be useful for checking if all descriptors are set by
the application.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset
b3c8de1c55 radv: save all descriptor pointers into the trace BO
To dump them when a hang is detected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset
d7f2430703 radv: dump annotated shaders using UMR
This might be very useful in order to figure out where a shader
is stuck. This uses UMR to detect which instruction is executing
bad things.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset
f0d09d9012 radeonsi: move si_get_wave_info() to AMD common code
This will allow us to use it from radv.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset
8181427b14 radv: dump some status MMIO registers when a hang occurred
Might report some useful information to help figure out where
the hang happened.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset
140621f7c4 radv/winsys: add a read_registers() callback
To dump some status MMIO registers when a hang is detected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset
6d957a86ff radv: dump shader stats when a hang occurred
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset
80b8d9f7e7 radv: add radv_shader_dump_stats() helper
To dump the shader stats when a hang is detected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset
d28cbf6f9e radv: dump the active shaders when a hang occurred
Only the disassembly is currently dumped.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset
e2e72477c0 radv: add debug flags for syncing shaders after every draw call
To improve GPU hang detection when shaders are stuck.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset
061f5b7d73 radv: add radv_cmd_buffer_after_draw() helper function
To share common code after every draw/compute call.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset
bcf7698211 radv: save the bound pipeline pointers into the trace BO
When a GPU hang is detected in radv_gpu_hang_occured() we know
which command buffer is faulty but the bound pipelines might
have been updated during the execution.

The pointers to the radv_pipeline objects are emitted just
after the second trace ID, that way it would be easy to dump
the active shaders at the moment of the hang.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset
3c61c99ed5 radv: add a comment that describes the trace BO layout
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-14 10:37:57 +02:00
Samuel Pitoiset
4224b31bf3 radv: initialize the trace BO to 0
To avoid random initial values.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-14 10:37:57 +02:00
Eric Engestrom
396d2dbce4 swr: use ARRAY_SIZE macro
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-14 09:36:01 +01:00
Jeremy Huddleston Sequoia
e7ef901650 mesa: Deal with size differences between GLuint and GLhandleARB in GetAttachedObjectsARB
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-09-13 19:48:58 -07:00
Denis Pauk
74d2456491 gallium/{r600, radeonsi}: Fix segfault with color format (v2)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102552

v2: Patch cleanup proposed by Nicolai Hähnle.
    * deleted changes in si_translate_texformat.

Cc: Nicolai Hähnle <nhaehnle@gmail.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-09-14 00:59:24 +02:00
Kenneth Graunke
edfd8d42a9 i965: Add an INTEL_DEBUG=submit option for printing batch statistics.
When a batch is submitted, INTEL_DEBUG=bat prints a message indicating
which part of the code triggered the flush, and some statistics about
the batch/state buffer utilization.

It also decodes the batchbuffer in debug builds...which is so much
output that it drowns out the utilization messages, if that's all you
care about.

INTEL_DEBUG=submit now just does the utilization messages.
INTEL_DEBUG=bat continues to do both (as the message is a good indicator
that we're starting decode of a new batch).

v2: Rename from "flush" to "submit" (suggested by Chris) because we
    might want "flush" for PIPE_CONTROL debugging someday.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-13 13:52:38 -07:00
Dave Airlie
64d9bd149a radv/nir: call opt_remove_phis after trivial continues.
With the shaders in the ssao demo, the nir_opt_if wasn't
working properly without this, after this the if gets optimised
so that loop unrolling gets called.

(loop unrolling fails due to instruction count, but at least
it gets to do that.)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-13 21:13:03 +01:00
Chad Versace
f9412a4e75 util/build_id: Include <dlfcn.h>
Fix the build for Android Nougat.

The dladdr(3) manpage says that <dlfcn.h> is required. On Linux, the
build succeeded without it because build_id.c includes <link.h> which
includes <dlfcn.h>. On Android, we must include <dlfcn.h> directly.

Fixes: 5c98d382 "util: Query build-id by symbol address, not library name"
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-09-13 12:43:42 -07:00
Chad Versace
5c98d3825c util: Query build-id by symbol address, not library name
This patch renames build_id_find_nhdr() to
build_id_find_nhdr_for_addr(), and changes it to never examine the
library name.

Tested on Fedora by confirming that build_id_get_data() returns the same
build-id as the file(1) tool. For BSD, I confirmed that the API used
(dladdr() and struct Dl_info) is documented in FreeBSD's manpages.

This solves two problems:

    - We can now query the build-id without knowing the installed library's
      filename.

      This matters because Android requires specific filenames for HAL
      modules, such as "/vendor/lib/hw/vulkan.${board}.so". The HAL
      filenames do not follow the Unix convention of "libfoo.so".  In
      other words, the same query code will now work on Linux and Android.

    - Querying the build-id now works correctly when the process
      contains multiple shared objects with the same basename.
      (Admittedly, this is a highly unlikely scenario).

Cc: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-13 09:49:27 -07:00
Nicolai Hähnle
c8db134e4d st/glsl_to_tgsi: remove unused code in temprename
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-13 18:28:29 +02:00
Nicolai Hähnle
55ca12be9d st/glsl_to_tgsi: be precise about merging scopes
enclosing_scope already contains enclosing_scope_first_read.
What we really want to check here -- not for correctness, but
for speed -- is whether last_read_scope already contains
enclosing_scope.

Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-09-13 18:28:11 +02:00
Nicolai Hähnle
cffc0ae0d9 ac/surface: match Z and stencil tile config
Fixes various piglit tests on Stoney, see the comment.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:27:01 +02:00
Nicolai Hähnle
481df8032b ac/surface: sanity-check that we got a TC-compatible HTILE if requested
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:26:59 +02:00
Nicolai Hähnle
b2b0702868 ac/addrlib: enable assertions in debug builds
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:26:56 +02:00
Nicolai Hähnle
113ecc2bfa ac/addrlib: relax an assertion
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:26:54 +02:00
Nicolai Hähnle
b0ee0e0860 ac/addrlib: relax an assertion
This assertion is triggered on Stoney in Piglit
./bin/framebuffer-blit-levels {draw,read} stencil -auto -fbo
and similar tests. It should be harmless -- just relax it until
we can get internal clarification.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:26:51 +02:00
Nicolai Hähnle
e4af4433fc radeonsi: hard-code pixel center for interpolateAtSample without multisample buffers
The GLSL rules for interpolateAtSample are unfortunate:

   "Returns the value of the input interpolant variable at
    the location of sample number sample. If
    multisample buffers are not available, the input
    variable will be evaluated at the center of the pixel.
    If sample sample does not exist, the position used to
    interpolate the input variable is undefined."

This fix will fall back to monolithic shader compilation when
interpolateAtSample is used without multisampling.

One alternative would be to always upload 16 sample positions,
filling the buffer up with repetition when the actual number of
samples is less, and then ANDing the sample ID with 0xf. However,
that punishes all well-behaving users of interpolateAtSample,
when in reality, only conformance tests should be affected by
the issue.
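
For illustration only, the rejected alternative could look roughly like
the sketch below. It is not what this commit implements, and the type
and function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_SAMPLES 16

/* Hypothetical sample position record. */
struct sample_pos {
    float x, y;
};

/* Fill all 16 slots by repeating the num_samples real positions. */
static void fill_sample_table(struct sample_pos table[SKETCH_MAX_SAMPLES],
                              const struct sample_pos *real,
                              unsigned num_samples)
{
    assert(num_samples >= 1 && num_samples <= SKETCH_MAX_SAMPLES);
    for (unsigned i = 0; i < SKETCH_MAX_SAMPLES; i++)
        table[i] = real[i % num_samples];
}

/* Masking the ID keeps any lookup inside the 16-entry table. */
static struct sample_pos lookup_sample(
    const struct sample_pos table[SKETCH_MAX_SAMPLES], unsigned sample_id)
{
    return table[sample_id & 0xf];
}
```

With a single sample at the pixel center, every slot holds the center,
so any sample ID resolves to it. The cost is the extra upload and the
AND on every lookup, which is the penalty the message refers to.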

Fixes
dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.non_multisample_buffer.*

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:25:45 +02:00
Nicolai Hähnle
92c4277990 radeonsi: apply a mask to gl_SampleMaskIn in the PS prolog
gl_SampleMaskIn is supposed to contain set bits only for the samples that
are covered by the current fragment shader invocation, but the VGPR
initialization hardware loads the set of all bits that are covered at the
current pixel.

Fixes various tests in
dEQP-GLES31.functional.shaders.sample_variables.sample_mask_in.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:25:41 +02:00
Nicolai Hähnle
792724a337 radeonsi: remove SET_PREDICATION workaround on newer firmware
We need to keep the workaround for older firmware, though.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:25:08 +02:00
Nicolai Hähnle
b8c6e88848 amd/common: get ME/PFP/CE firmware feature versions as well
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:25:06 +02:00
Nicolai Hähnle
8d8f1ef573 radeonsi: rename variable to clarify its meaning
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:24:18 +02:00
Nicolai Hähnle
48b3364b5b radeonsi: make si_init_shader_selector_async static
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:24:18 +02:00
Nicolai Hähnle
7e4344151f radeonsi: fix segfault in descriptor dumping
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:24:18 +02:00
Nicolai Hähnle
81f398dcb1 ddebug: write out final driver log messages with GALLIUM_DDEBUG=always
If the last operation happens to be a non-draw, such as a
transfer_map that triggers a decompress blit, there may be
interesting messages left in the driver log.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-13 18:24:18 +02:00
Tim Rowley
000e2958f5 swr/rast: Fetch compile state changes
Add InstanceStrideEnable field and rename InstanceDataStepRate to
InstanceAdvancementState in INPUT_ELEMENT_DESC structure.

Add stubs for handling InstanceStrideEnable in FetchJit::JitLoadVertices()
and FetchJit::JitGatherVertices() and assert if they are triggered.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-13 10:09:54 -05:00
Tim Rowley
ead0dfe31e swr/rast: adjust linux cpu topology identification code
Make more robust to handle strange configurations like a vmware
exported 4-way numa X 1-core configuration.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-13 10:09:47 -05:00
Tim Rowley
1ccf9ad280 swr/rast: Missed conversion to SIMD_T
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-13 10:09:41 -05:00
Tim Rowley
c0ce5c4422 swr/rast: whitespace changes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-13 10:09:35 -05:00
Tim Rowley
6b9e801832 swr/rast: add graph write to jit debug output
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-13 10:09:30 -05:00
Tim Rowley
6f0fcec07a swr/rast: Migrate memory pointers to gfxptr_t type
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-13 10:09:24 -05:00
Tim Rowley
ae2412dbbd swr/rast: Remove hardcoded clip/cull slot from clipper
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-13 10:09:18 -05:00
Tim Rowley
5471f65976 swr/rast: Start to remove hardcoded clipcull_dist vertex attrib slot
Add new field in SWR_BACKEND_STATE::vertexClipCullOffset to specify the
start of the clip/cull section of the vertex header.  Removed use of
hardcoded slot from binner.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-13 10:09:11 -05:00
Tim Rowley
9669972692 swr/rast: Move clip/cull enables in API
Moved from SWR_RASTSTATE to SWR_BACKEND_STATE.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-13 10:09:04 -05:00
Tim Rowley
f5031fb952 swr/rast: Add new API SwrStallBE
SwrStallBE stalls the backend threads until all work submitted before
the stall has finished.  The frontend threads can continue to make
forward progress.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-13 10:08:46 -05:00
Eric Engestrom
2f6ffab1ce glsl: compile unused function out
The function is only called from one place, which is hidden behind
the same `#ifdef DEBUG`.

Fixes: ca73c3358c "glsl: Mark functions static"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-09-13 11:22:27 +01:00
Eric Engestrom
c0b81af0dc radv: compile out unused code
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-13 11:19:30 +01:00
Samuel Pitoiset
375c4868ef radv: clear push_constant_stages when resetting a command buffer
Per the spec:

   "Resetting a command buffer is an operation that discards any
   previously recorded commands and puts a command buffer in the
   initial state."

As far as I'm concerned, that flag can be changed by calling
vkCmdPushConstants() (or any other functions which update it),
so it should be cleared as well.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-13 09:47:45 +02:00
Samuel Pitoiset
ef197ead75 radv: add more radv_emit_XXX() helpers for the dynamic state
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-13 09:47:43 +02:00
Samuel Pitoiset
ce218c31eb radv: remove useless 'cmd_buffer' param from radv_buffer_view_init()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-13 09:47:41 +02:00
Dave Airlie
3633bae36b radv/gfx9: fix image resource handling.
GFX9 changes how images are laid out, so this needs updating.

Fixes: dEQP-VK.query_pool.statistics_query.*

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-13 17:17:07 +10:00
Dave Airlie
aba441be44 radv/ac: bump params array for image atomic comp swap
For the comp_swap case this was overflowing and crashing
sometimes.

Fixes:
dEQP-VK.image.atomic_operations.compare_exchange.*

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-13 17:17:02 +10:00
Dave Airlie
ebd2a5354d radv/gfx9: set mip0-depth correctly for 2d arrays/3d images
This field covers the whole resource.

Fixes:
dEQP-VK.pipeline.image.suballocation.sampling_type.combined.view_type.3d.format.*
dEQP-VK.texture.filtering.3d.combinations.*

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-13 17:16:52 +10:00
Dave Airlie
1bcb953e16 radv: handle GFX9 1D textures
As GFX9 can't handle 1D depth textures, radeonsi and
apparently pro just update all 1D textures to 2D,
and work around it.

This ports the workarounds from radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-13 08:40:41 +10:00
Dave Airlie
2f5b4490b5 radv: don't use iview for meta image width/height.
Work out the width/height from the level manually, as on GFX9
we won't minify the iview width/height.
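Working the size out from the level by hand usually comes down to the standard mip minification rule: shift the base dimension right by the level and clamp to one texel. A minimal sketch, assuming a hypothetical helper named `minify_dim` (it is not the function the patch itself uses):

```c
#include <stdint.h>

/* Hypothetical helper for illustration: size of one dimension at a
 * given mip level, following the usual max(1, base >> level) rule. */
static inline uint32_t
minify_dim(uint32_t base, uint32_t level)
{
   /* Shifting a 32-bit value by 32 or more is undefined in C. */
   if (level >= 32)
      return 1;

   uint32_t size = base >> level;
   return size > 0 ? size : 1;
}
```

For a 256-texel base this yields 32 at level 3, and any dimension bottoms out at 1 once the shift would reach zero.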

This fixes:
dEQP-VK.api.image_clearing.core.clear_color_image* on gfx9

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-13 08:40:18 +10:00
Jason Ekstrand
d496780fb2 intel/eu/validate: Look up types on demand in execution_type()
We are looking up the execution type prior to checking how many sources
we have.  This leads to looking for a type for src1 on MOV instructions
which is bogus.  On BDW+, the src1 register type overlaps with the
64-bit immediate and causes us problems.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-09-12 15:01:00 -07:00
Marek Olšák
4ba20c9473 Revert "winsys/amdgpu: disable local BOs on Raven"
This reverts commit 1cda9a2fee.

It works now.
2017-09-12 22:44:02 +02:00
Bas Nieuwenhuizen
1a172fb113 radv: Don't allocate CMASK for linear images.
We can't use it anyway in fast clears, and on GFX9 it seems to
actually hang the card if we specify it.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
2017-09-12 22:06:55 +02:00
Bas Nieuwenhuizen
bee83b2661 radv: Disable multilayer & multilevel DCC.
The current DCC init routine doesn't account for initializing a
single layer or level. Multilayer seems hard for small textures on
pre-GFX9 as the metadata for the layers can be interleaved. For
GFX9 multilevel textures are a problem for similar reasons.

So just disable this for now, until we handle the texture modes
correctly.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
2017-09-12 22:06:24 +02:00
Kenneth Graunke
e9cf458fa8 docs: Document shader capturing environment variables.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-09-12 09:27:09 -07:00
Eric Engestrom
85b66d2096 docs/egl: remove reference to EGL_DRIVERS_PATH
Support for external egl drivers was dropped a few years ago.

Fixes: 209360bbb9 "egl/main: drop support for external egl drivers"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-12 16:14:04 +01:00
Eric Engestrom
d861eb5fc2 util/disk_cache: turn MESA_GLSL_CACHE_DISABLE into a boolean
Instead of setting based on set/unset, allow users to use boolean values.
In the docs and tests, use `DISABLE=true` instead of `DISABLE=1` as it's
clearer IMO.
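The difference between a set/unset check and a boolean check can be sketched as follows. This is only an illustration: the helper names `parse_bool_string` and `env_as_bool`, and the exact list of accepted spellings, are assumptions and not necessarily what the patch uses.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

/* Hypothetical parser: map common spellings to a boolean, and fall back
 * to the caller's default when the string is missing or unrecognised. */
static bool
parse_bool_string(const char *str, bool default_value)
{
   if (str == NULL)
      return default_value;

   if (strcmp(str, "1") == 0 || strcasecmp(str, "true") == 0 ||
       strcasecmp(str, "yes") == 0)
      return true;

   if (strcmp(str, "0") == 0 || strcasecmp(str, "false") == 0 ||
       strcasecmp(str, "no") == 0)
      return false;

   return default_value;
}

/* Read an environment variable as a boolean instead of testing only
 * whether it is set. */
static bool
env_as_bool(const char *name, bool default_value)
{
   return parse_bool_string(getenv(name), default_value);
}
```

With this shape, a value such as `false` no longer counts as "set", which a plain presence check would have treated the same as `true`.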

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-12 13:53:12 +01:00
Eric Engestrom
717fb6e4be glx: turn LIBGL_NO_DRAWARRAYS into a boolean
Instead of setting based on set/unset, allow users to use boolean values.
In the docs, use `NO_DRAWARRAYS=true` instead of `NO_DRAWARRAYS=1` as it's
clearer IMO.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-12 13:53:12 +01:00
Eric Engestrom
d2768a397d glx: turn LIBGL_PROFILE_CORE into a boolean
Instead of setting based on set/unset, allow users to use boolean values.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-12 13:53:12 +01:00
Eric Engestrom
3fdbc46b42 glx: turn LIBGL_DUMP_VISUALID into a boolean
Instead of setting based on set/unset, allow users to use boolean values.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-12 13:53:12 +01:00
Eric Engestrom
14e431b270 egl+glx: turn LIBGL_DRI3_DISABLE into a boolean
Instead of setting based on set/unset, allow users to use boolean values.
In the docs, use `DISABLE=true` instead of `DISABLE=1` as it's clearer IMO.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-12 13:53:12 +01:00
Eric Engestrom
177fd320d6 glx: turn LIBGL_ALWAYS_INDIRECT into a boolean
Instead of setting based on set/unset, allow users to use boolean values.
In the docs, use `ALWAYS=true` instead of `ALWAYS=1` as it's clearer IMO.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-12 13:53:12 +01:00
Eric Engestrom
43e2d58698 glx: turn LIBGL_ALLOW_SOFTWARE into a boolean
Instead of setting based on set/unset, allow users to use boolean values.
In the help string, use `ALLOW=true` instead of `ALLOW=1` as it's clearer IMO.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-12 13:53:11 +01:00
Eric Engestrom
5c68ea29f3 egl+glx: turn LIBGL_ALWAYS_SOFTWARE into a boolean
Instead of setting based on set/unset, allow users to use boolean values.
In the docs, use `ALWAYS=true` instead of `ALWAYS=1` as it's clearer IMO.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-12 13:53:11 +01:00
Eric Engestrom
f4a9d205d8 glx: turn LIBGL_DIAGNOSTIC into a boolean
Instead of setting based on set/unset, allow users to use boolean values.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-12 13:53:11 +01:00
Eric Engestrom
6ea8db5b4c gbm: turn GBM_ALWAYS_SOFTWARE into a boolean
Instead of setting based on set/unset, allow users to use boolean values.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-12 13:53:11 +01:00
Tapani Pälli
f940b1665a anv: fix build issues on release build
Fixes: d083bc1c4b ("anv: wire up vk_errorf macro to do debug reporting")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-09-12 13:29:11 +03:00
Iago Toral Quiroga
51bf007d2c glsl: Disallow unsized array of atomic_uint
This was a bugfix to the spec addressed in OpenGL 4.5 (revision
7 of the spec) and there is a CTS test to check this.

Fixes:
KHR-GL45.shader_atomic_counters.negative-unsized-array

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-09-12 09:16:05 +02:00
Tapani Pälli
ea314bf812 anv: remove extra 'debug:' from anv_debug_ignored_stype
anv_debug already adds 'debug:'; this cleans up the following:
   debug: debug: anv_CreateDebugReportCallbackEXT: ignored VkStructureType 1000011000

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-12 09:42:19 +03:00
Tapani Pälli
a7ebb21744 anv: move brw_process_intel_debug_variable to happen early
Currently anv_perf_warn call in anv_compute_heap_size does not ever
report a perf warning. Move debug variable read as the first thing
in case there will be other perf_warn calls added.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-12 09:42:11 +03:00
Tapani Pälli
d083bc1c4b anv: wire up vk_errorf macro to do debug reporting
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-12 09:42:00 +03:00
Tapani Pälli
73638be11f anv: wire up anv_perf_warn macro to do debug reporting
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-12 09:41:10 +03:00
Tapani Pälli
086cfa5652 anv: implementation of VK_EXT_debug_report extension
Patch adds required functionality for extension to manage a list of
application provided callbacks and handle debug reporting from driver
and application side.

v2: remove useless helper anv_debug_report_call
    add locking around callbacks list
    use vk_alloc2, vk_free2
    refactor CreateDebugReportCallbackEXT
    fix bugs found with crucible testing

v3: provide ANV_FROM_HANDLE and use it
    misc fixes for issues Jason found
    use vk_find_struct_const for finding ctor_cb

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-12 09:39:29 +03:00
Iago Toral Quiroga
ab6f874439 i965: do not fallback to linear tiling for stencil surfaces
We were skipping this fallback for depth, but not for stencil
which the hardware always requires to be W-tiled.

Also, make the checks for whether we need to apply retiling
strategies based on usage instead of tiling flags, which is
safer and more explicit.

This fixes a regression in a CTS test introduced with commit
4ea63fab77 that started applying re-tiling to stencil surfaces
in certain scenarios.

v2: discard retiling based on usage fields instead of tiling
    flags. This is safer and more explicit.

v3: Add a comment indicating that texturing of stencil in gen7
    requires a Y-tiled copy (Topi).

Fixes:
KHR-GL45.direct_state_access.renderbuffers_storage

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-09-12 08:09:45 +02:00
Juan A. Suarez Romero
806ae6a648 nir/spirv: handle if's with same label in both branches
When a conditional branch has the same labels in the "if" part and in the
"else" part, then we have the same cfg block, and it must be handled
once.

v2: handle it the same way as OpBranch (Jason).
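The idea can be sketched in a few lines: when both targets of a conditional branch are the same label, there is only one successor block, so the branch can be treated like an unconditional one. The enum and function below are hypothetical names for illustration, not the parser's actual code.

```c
#include <stdint.h>

enum branch_kind {
   BRANCH_UNCONDITIONAL, /* single successor, no if/else needed */
   BRANCH_IF_ELSE,       /* two distinct successors */
};

/* Hypothetical classification of a conditional branch by its two
 * target label ids. */
static enum branch_kind
classify_cond_branch(uint32_t then_label, uint32_t else_label)
{
   return then_label == else_label ? BRANCH_UNCONDITIONAL
                                   : BRANCH_IF_ELSE;
}
```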

Fixes:
dEQP-VK.spirv_assembly.instruction.compute.conditional_branch.same_labels*
dEQP-VK.spirv_assembly.instruction.graphics.conditional_branch.same_labels*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-12 07:01:40 +02:00
Aaron Watry
5666d3e3e5 mesa/st: Include builddir/src/compiler/glsl to fix make check
Otherwise, when doing an out-of-tree build you can expect the following:

make[6]: Entering directory \
         '${MESA_SRC}/build/src/mesa/state_tracker/tests'
  CXX      test_glsl_to_tgsi_lifetime.o
In file included from \
    ${MESA_SRC}/src/mesa/src/mesa/state_tracker/st_glsl_to_tgsi_private.h:31:0,
  from \
    ${MESA_SRC}/src/mesa/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.h:27,
  from \
    ${MESA_SRC}/src/mesa/src/mesa/state_tracker/tests/test_glsl_to_tgsi_lifetime.cpp:24:
  ${MESA_SRC}/src/compiler/glsl/ir.h:1502:37: \
    fatal error: ir_expression_operation.h: No such file or directory
 #include "ir_expression_operation.h"

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Gert Wollny <gw.fossdev@gmail.com>
2017-09-11 20:18:18 -05:00
Dave Airlie
f2d0f587ca radv: work out a base ia_multi_vgt_param.
This just reduces the calculations a bit further.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-11 23:55:15 +01:00
Dave Airlie
ded1dbfd96 radv: calculate non-draw related ia_multi_vgt_param bits in pipeline
This moves a bunch of non-draw dependent calcs into the pipeline code,
to reduce CPU overheads in the draw path.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-11 23:55:15 +01:00
Dave Airlie
d2490eb2d1 radv: move calculating primgroup_size to pipeline.
This moves this out of the draw paths.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-11 23:55:15 +01:00
Dave Airlie
16eac0a756 radv: only calculate num_prims when required.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-11 23:55:15 +01:00
Dave Airlie
6cc545b212 radv: use upload_data to upload push descriptors.
This just reuses existing code.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-11 23:55:15 +01:00
Dave Airlie
1dbcfd2941 radv: realign vgt flush on hawaii workaround with radeonsi.
This realigns this code with the radeonsi version and fixes
the indirect case to work properly.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-11 23:55:14 +01:00
Samuel Pitoiset
4f395e28a7 radv: return an error code when resetting a command buffer
If the upload BO allocation failed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-11 21:57:02 +02:00
Samuel Pitoiset
03542d1663 radv: remove unnecessary goto in radv_create_cmd_buffer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-11 21:57:01 +02:00
Samuel Pitoiset
fcab014f7d radv: do not pass a pipeline object to radv_emit_graphics_pipeline()
To be consistent with radv_emit_compute_pipeline().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-11 21:56:58 +02:00
Dave Airlie
310fca375c radv: add debug flags to zero vram allocations.
We are seeing apps that sometimes rely on Windows behaviour, add
a flag to rule out vram zeroing.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-12 05:37:15 +10:00
Marek Olšák
6eade342eb radeonsi: optimize TCS epilog when invocation 0 writes tess factors
This removes the barrier and LDS stores and loads for tess factors
when it's possible. The removal of the barrier seems more important
to me though.

In one shader, it removes 17 * 4 bytes from the shader binary.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-11 19:02:02 +02:00
Marek Olšák
386d165d8d tgsi/scan: add a new pass that analyzes tess factor writes (v2)
The pass tries to deduce whether tess factors are always written by
all shader invocations.

The implication for radeonsi is that it doesn't have to use a barrier
near the end of TCS, and doesn't have to use LDS for passing the tess
factors to the epilog.

v2: Handle barriers and do the analysis pass for each code segment
    surrounded by barriers separately, and AND results from all
    such segments writing tess factors. The change is trivial in the main
    switch statement.

    Also, the result is renamed to "tessfactors_are_def_in_all_invocs"
    to make the name accurate.
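The v2 combination step can be sketched like this: each barrier-delimited segment is analyzed on its own, and the per-segment results are ANDed together for the segments that write tess factors. The struct and function names are hypothetical, and the result for a shader with no writing segment at all is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-segment summary; a segment is the code between two
 * barriers. */
struct segment_info {
   bool writes_tessfactors;    /* segment writes any tess factor */
   bool defined_in_all_invocs; /* those writes happen in every invocation */
};

static bool
tessfactors_defined_in_all_invocs(const struct segment_info *segs,
                                  size_t count)
{
   bool any_writer = false;
   bool result = true;

   for (size_t i = 0; i < count; i++) {
      if (!segs[i].writes_tessfactors)
         continue; /* segments that never write do not affect the result */

      any_writer = true;
      result = result && segs[i].defined_in_all_invocs;
   }

   /* Assumption of this sketch: with no writer at all, report false. */
   return any_writer && result;
}
```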

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-11 19:02:02 +02:00
Anuj Phogat
b2dae9f8fd intel: Remove unused Kabylake pci id
I missed this one in Mesa commit ebc5ccf.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-09-11 08:45:43 -07:00
Rob Herring
0ba2d26525 Android: Add LLVM support for Android P
The Android version in AOSP master has changed now to P, so we need to add
LLVM flags for it. Duplicating the lines because I expect the version will
get bumped at some point and diverge from O.

Cc: Chih-Wei Huang <cwhuang@android-x86.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-09-11 09:35:23 -05:00
Chih-Wei Huang
af726a1e2c Android: fix undeclared identifier 'gfx9d_reg_table'
Since commit 552aaa11 the compiler complains:

external/mesa/src/amd/common/ac_debug.c:124:51: error: use of undeclared identifier 'gfx9d_reg_table'; did you mean 'sid_reg_table'?
                reg = find_register(gfx9d_reg_table, ARRAY_SIZE(gfx9d_reg_table), offset);
                                                                ^~~~~~~~~~~~~~~
                                                                sid_reg_table

It's because commit ef97cc0c ("radeonsi/gfx9: add IB parser support")
added gfx9d.h as a prerequisite of sid_tables.h, but the corresponding
Android.mk was not updated. This went unnoticed because gfx9d_reg_table
was not actually used until commit 552aaa11 landed.

Fixes: 552aaa11 (ac/debug: take ASIC generation into account when printing registers)

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-09-11 09:35:23 -05:00
Marek Olšák
a2a326e8f8 winsys/amdgpu: use the new raw CS API
This also cleans things up.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-11 16:29:52 +02:00
Marek Olšák
3824ca7610 radeonsi: implement pipe_context::fence_server_sync
This will be more useful once we have sync_file support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-11 16:29:52 +02:00
Marek Olšák
8843bf6dfd winsys/amdgpu: factor out some fence dependency code into separate functions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-11 16:29:52 +02:00
Marek Olšák
a6eb164eb2 winsys/amdgpu: rename fence_dependency functions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-11 16:29:52 +02:00
Marek Olšák
fc45495474 gallium/radeon: add a proper fail path for calloc in r600_flush_from_st
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-11 16:29:52 +02:00
Marek Olšák
7213293fe2 winsys/amdgpu: don't allow interprocess resource sharing for IBs
Now we should get IB submissions with bo_list == NULL when DRI buffers
aren't referenced.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-11 16:29:52 +02:00
Marek Olšák
46e7478986 radeonsi/gfx9: fix interprocess resource sharing on Raven
This is kinda fragile, but it at least unbreaks the driver.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-11 16:29:52 +02:00
Nicolai Hähnle
03203b7448 st/glsl_to_tgsi: only the first (inner-most) array reference can be a 2D index
Don't get distracted by record dereferences between array references.

Fixes dEQP-GLES31.functional.tessellation.user_defined_io.per_vertex_block.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-11 15:03:57 +02:00
Samuel Iglesias Gonsálvez
5b1b088f2a nir/spirv: fix chain access with different index bit sizes
Currently we support 32-bit indexes/offsets all over the driver, so we
convert them to that bit size.

Fixes dEQP-VK.spirv_assembly.instruction.*.indexing.*

v2: Use u2u32 instead (Jason).
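An unsigned-to-unsigned conversion to 32 bits zero-extends narrower indices and truncates wider ones. A minimal sketch in plain C, assuming a hypothetical helper `index_to_u32` that takes the raw bits and the source bit size (this is not NIR builder code):

```c
#include <stdint.h>

/* Hypothetical helper: convert an unsigned index of bit_size bits,
 * stored in the low bits of raw, to a 32-bit index. */
static uint32_t
index_to_u32(uint64_t raw, unsigned bit_size)
{
   /* Keep only the bits that belong to the source type. */
   if (bit_size < 64)
      raw &= (UINT64_C(1) << bit_size) - 1;

   /* Narrower values are now zero-extended; 64-bit values truncate. */
   return (uint32_t)raw;
}
```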

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-11 10:03:39 +02:00
Dave Airlie
8d6b97a815 r600: handle the non-TXF_LZ support path.
It appears that texcoord.z/w will be 0 in all cases already,
so just put them into the vbo always.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-11 02:10:24 +02:00
Marek Olšák
c1d92f8222 gallium/u_blitter: use UTIL_BLITTER_ATTRIB_NONE (0) instead of 0 directly
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Brian Paul <brianp@vmware.com>
2017-09-11 02:10:24 +02:00
Marek Olšák
005fa89bfa gallium/u_blitter: don't pass GENERIC in VS if it's not needed
Now, depth-only clears and custom passes don't read memory in VS.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Brian Paul <brianp@vmware.com>
2017-09-11 02:10:24 +02:00
Marek Olšák
22ed1ba01a gallium/u_blitter: use draw_rectangle for all blits except cubemaps
Add ZW coordinates to the draw_rectangle callback and use it.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Brian Paul <brianp@vmware.com>
2017-09-11 02:10:24 +02:00
Marek Olšák
43247c440e gallium/u_blitter: use draw_rectangle callback for layered clears
They are done with instancing.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Brian Paul <brianp@vmware.com>
2017-09-11 02:10:23 +02:00
Marek Olšák
7aaf4c73de gallium/u_blitter: add new union blitter_attrib to replace pipe_color_union
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Brian Paul <brianp@vmware.com>
2017-09-11 02:10:23 +02:00
Marek Olšák
e4c457f695 gallium/radeon: use rectangles for 1D and 2D texture blits
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-11 02:10:23 +02:00
Eric Engestrom
ce7164252e i965/tex: add missing include
src/mesa/drivers/dri/i965/intel_tex.h:52:40: warning: ‘enum intel_miptree_create_flags’ declared inside parameter list will not be visible outside of this definition or declaration
                 enum intel_miptree_create_flags flags);
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: cadcd89278 "i965/tex: Change the flags type on
                             create_for_teximage"
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-10 13:14:06 +01:00
Bas Nieuwenhuizen
e3c9425158 radv: Actually check for vm faults.
The code can check for vm faults having happened. If we only do it
on a hang we don't know when the faults happened. This changes the
behavior to when the first VM fault is found, even without a hang.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-09-09 11:50:30 +02:00
Roland Scheidegger
57a341b0a9 llvmpipe, draw: improve shader cache debugging
With GALLIVM_DEBUG=perf set, output the relevant stats for shader cache usage
whenever we have to evict shader variants.
Also add some output when shaders are deleted (but not with the perf setting
to keep this one less noisy).
While here, also don't delete that many shaders when we have to evict. For fs,
there's potentially some cost if we have to evict due to the required flush,
however certainly shader recompiles have a high cost too so I don't think
evicting one quarter of the cache size makes sense (and, if we're evicting
based on IR count, we probably typically evict only very few or just one
shader too). For vs, I'm not sure it even makes sense to evict more than
one shader at a time, but keep the logic the same for now.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-09-09 03:06:10 +02:00
Roland Scheidegger
772f475351 llvmpipe: enable PIPE_CAP_QUERY_PIPELINE_STATISTICS
This was implemented since forever, but not enabled.
It passes all piglit tests except one, arb_pipeline_statistics_query-frag.
The reason is that the test (for drawing a 10x10 rect) expects between
100 and 150 pixel shader invocations. But since llvmpipe counts this with
4x4 granularity (and due to the rect being 2 tris) we end up with 224
invocations. I believe however what llvmpipe is doing violates neither the
spirit nor the letter of the spec (our fragment shader granularity really
is 4x4 pixels, albeit we will bail out early on 2x2 or 4x2 (the latter
if AVX is available) granularity); the spec allows counting additional
invocations for implementation reasons.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-09-09 03:06:10 +02:00
Roland Scheidegger
dcf2feadc3 gallivm: fix gather implementation a bit
gather is defined in terms of bilinear filtering, just without the filtering
part. However, there's actually some subtle differences required in our
implementation, because we use some tricks to simplify coord wrapping for the
two coords per direction.
For bilinear filtering, we don't care if we end up with an incorrect
texel, as long as the filter weight is 0.0 for it. Likewise, the order of
the texels doesn't actually matter (as long as they still have the correct
filter weight).
But for gather, these tricks lead to incorrect results.
Fix this for CLAMP_TO_EDGE, and add some comments to the other wrap functions
which look broken (the 3 mirror_clamp plus mirror_repeat) (too complex to fix
right now, and no one really seems to care...).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-09-09 03:06:10 +02:00
Andres Gomez
30682fba77 docs: update calendar, add news item and link release notes for 17.1.9
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-09-09 02:15:41 +03:00
Andres Gomez
97dce9e278 docs: add sha256 checksums for 17.1.9
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-09-09 02:11:21 +03:00
Andres Gomez
1e1131782c docs: add release notes for 17.1.9
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-09-09 02:11:19 +03:00
Brian Paul
832990c0ce mesa: whitespace, formatting fixes in teximage.c
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-09-08 13:58:51 -06:00
Brian Paul
33c55e8a9d mesa: provide more info in some texture image error messages
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-09-08 13:58:40 -06:00
Charmaine Lee
57d9222ef2 svga: abort shader translation upon indirect indexing of temporaries
This patch aborts shader translation upon indirect indexing of temporary
registers on non-vgpu10 devices. This prevents an unsupported feature
from being sent to the device.

Tested with MTT-piglit, glretrace.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-09-08 13:58:38 -06:00
Samuel Pitoiset
885d75760b radv: keep track of the disasm string in debug mode only
This will allow dumping the active shaders when a hang is
detected. Only the ASM will be dumped for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-08 17:18:17 +02:00
Samuel Pitoiset
92db23f3f9 radv: add shader_variant_create() helper function
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-08 17:17:40 +02:00
Samuel Pitoiset
47efc5264a radv: drop 'dump' parameters from some shader related functions
The device object contains the debug flags.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-08 17:17:40 +02:00
Samuel Pitoiset
d4d777317b radv: move shaders related code to radv_shader.c
Reduce size of radv_pipeline.c and improve code isolation. More
code can probably be moved, but it's a start.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-08 17:17:40 +02:00
Samuel Pitoiset
988d792375 radv: fix error code when initializing the push descriptors
malloc() failures are unrelated to the device memory.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-08 16:15:21 +02:00
Samuel Pitoiset
67ee31a086 radv: do not update vertex descriptors if the allocation failed
An error code is stored in the command buffer and should
be returned to the user via EndCommandBuffer().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-08 16:04:51 +02:00
Samuel Pitoiset
fefbcb090d radv: add radv_vertex_elements_info data structure
In my opinion, this improves code readability.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-08 16:04:51 +02:00
Eric Engestrom
f77d06fb28 gallium/tests: use ARRAY_SIZE macro
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-08 10:29:40 +01:00
Eric Engestrom
db8c5ae853 r300: use ARRAY_SIZE macro
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-08 10:29:40 +01:00
Eric Engestrom
440ab62341 glx: use ARRAY_SIZE macro
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
2017-09-08 10:29:40 +01:00
Samuel Pitoiset
b33b85cdd4 radv: add an assertion when pushing meta descriptor sets
Just to make sure we are using the set 0, because it's the
only one which is saved/restored when doing meta operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-08 09:42:23 +02:00
Thomas Hellstrom
6e2b87c7e9 mesa/st: Fix frontbuffer rendering regression
This fixes a regression introduced with commit
"mesa/st: Reduce the number of frontbuffer flush calls"
where we, after flushing the front buffer, marked it as not-rendered-to,
the idea being that it should be marked as "rendered-to" again as soon as
any rendering was touching the front.

Now the latter part never happened, because it was part of a state
validation and we never marked that part of the state as dirty.

So mark the framebuffer state dirty after a frontbuffer flush.
(fdo bugzilla 102496)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102496
Fixes: eceb671002 (mesa/st: Reduce the number of frontbuffer flush calls)
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
2017-09-08 09:26:18 +02:00
Kenneth Graunke
44ac54a3fd i965: Don't special case the batchbuffer when reference counting.
We don't need to special case the batch - when we add the batch to the
validation list, we can simply increase the refcount to 2, and when we
make a new batch, we'll drop it back down to 1 (when unreferencing all
buffers in the validation list).  The final reference is still held by
brw->batch.bo, as it was before.

This removes the special case from a bunch of loops.
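
A minimal sketch of the idea described above, not the actual i965 code:
the batch buffer goes through the same validation-list path as every other
buffer, so its refcount is 2 while it is listed and drops back to 1 when a
new batch starts. The struct layout, the function names and the fixed
MAX_VALIDATION capacity are assumptions made for illustration only; freeing
at refcount 0 and duplicate handling are left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VALIDATION 16   /* hypothetical fixed capacity for this sketch */

struct bo {
    int refcount;
};

struct batch {
    struct bo *bo;                          /* holds the final reference */
    struct bo *validation[MAX_VALIDATION];  /* buffers used by this batch */
    size_t count;
};

/* Every buffer, including the batch buffer itself, takes the same path. */
void add_to_validation_list(struct batch *batch, struct bo *bo)
{
    assert(batch->count < MAX_VALIDATION);
    bo->refcount++;
    batch->validation[batch->count++] = bo;
}

/* Starting a new batch drops one reference per validation-list entry. */
void reset_validation_list(struct batch *batch)
{
    for (size_t i = 0; i < batch->count; i++)
        batch->validation[i]->refcount--;
    batch->count = 0;
}
```

With this shape, the loops over the validation list no longer need a
"skip the batch buffer" branch.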

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-08 00:21:09 -07:00
Connor Abbott
b909d278d0 ac: remove bitcast_to_float()
ac_to_float() does a superset of what it does.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-08 04:24:56 +01:00
Connor Abbott
50967cd0b0 ac: move ac_to_integer() and ac_to_float() to ac_llvm_build.c
We'll need to use ac_to_integer() for other stuff in ac_llvm_build.c.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-08 04:24:02 +01:00
Connor Abbott
fafa299511 ac: fix ac_get_type_size() for doubles
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-08 04:19:47 +01:00
Dave Airlie
4cab214e76 radv/ac: use ac_get_type_size.
Just moved to newly shared code.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-08 04:15:50 +01:00
Connor Abbott
b8a51c8c4b radeonsi: move the guts of ARB_shader_group_vote emission to ac
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-08 04:12:49 +01:00
Connor Abbott
bd73b89792 radeonsi: move si_emit_ballot() to ac
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-08 04:12:42 +01:00
Connor Abbott
ac27fa7294 radeonsi: move emit_optimization_barrier() to ac
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-08 04:06:47 +01:00
Connor Abbott
c181d4f2b7 radeonsi: move llvm_get_type_size() to ac
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-08 04:04:16 +01:00
Dave Airlie
a5add6fb30 radv/winsys: fix flags vs va_flags thinko.
Fixes: e8d57802f (radv/gfx9: allocate events from uncached VA space)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-08 12:30:23 +10:00
Dave Airlie
219d29e4d8 radv: use simpler indirect packet 3 if possible.
This fixes some observed hangs on CIK GPUs.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-07 21:05:16 +01:00
Dave Airlie
e8d57802fe radv/gfx9: allocate events from uncached VA space
This copies what amdgpu-pro does, and allocates the memory
for an event with an uncached mtype.

This fixes hangs with:
dEQP-VK.api.command_buffers.record_simul_use_primary

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-07 21:04:54 +01:00
Dave Airlie
76ac8fafad radv/winsys: use amdgpu_bo_va_op_raw.
This is a precursor to the gfx9 fix to use uncached for the event
memory. Move to the interface which allows setting the flags,
but wrap it to avoid having to copy it around the place.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-07 21:04:24 +01:00
Leo Liu
6e8ef53837 Revert "st/va: add enviromental variable to disable interlace"
This reverts commit 10dec2de2d.

The environment variable is no longer needed with the previous change

Reviewed-by: Christian König <christian.koenig@amd.com>
2017-09-07 13:32:36 -04:00
Leo Liu
15d4d44d9b st/va: move YUV content to deinterlaced buffer when reallocated for encoder
v2: use deinterlace common function
v3: make sure deinterlace only

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-09-07 13:32:36 -04:00
Leo Liu
cadeb73f6b st/va: reallocate the buffer if the layout isn't supported
This makes buffer reallocation based on the buffer layout clearer
for both decoder and encoder.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-09-07 13:32:36 -04:00
Leo Liu
78ec7400c5 vl/compositor: make vl_compositor_set_yuv_layer() static
Since it's no longer being called outside of compositor

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-09-07 13:32:36 -04:00
Leo Liu
9f32078c20 st/omx: use vl/compositor helper function for YUV deinterlacing
v2: separate helper function in different patch

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-09-07 13:32:36 -04:00
Leo Liu
a6da7e6c3a vl/compositor: make a helper function for YUV deinterlacing
A similar function exists in OMX and is only used by OMX. Move it
to vl/compositor so that other state trackers can use it later.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-09-07 13:32:36 -04:00
Marek Olšák
4bd2bdbb3c ac/surface: add radeon_surf::has_stencil for convenience
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 17:59:37 +02:00
Gert Wollny
c4741bbb6f mesa/st/tests: Fix regressions with libunwind enabled introduced with 7be6d8fe12
Add the according flags to link with libunwind.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=102565
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 14:14:02 +02:00
Gert Wollny
ab16538b83 mesa/st/tests: Fix classic build regressions introduced with 7be6d8fe12
Fixes the build in classic only mode, i.e. the new state tracker tests are
only built when Gallium is enabled.

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 14:13:43 +02:00
Iago Toral Quiroga
580fc06c61 mesa/main: Fix GetTransformFeedbacki64 for glTransformFeedbackBufferBase
The spec has special rules for querying buffer offsets and sizes
when BindBufferBase is used, described in the OpenGL 4.6 spec,
section 6.8 Buffer Object State:

   "To query the starting offset or size of the range of a buffer
    object binding in an indexed array, call GetInteger64i_v with
    target set to respectively the starting offset or binding size
    name from table 6.5 for that array. Index must be in the range
    zero to the number of bind points supported minus one. If the
    starting offset or size was not specified when the buffer object
    was bound (e.g. if it was bound with BindBufferBase), or if no
    buffer object is bound to the target array at index, zero is
    returned."

Transform feedback buffer queries should follow the same rules, since
it is the same case for them. There is a CTS test for this.
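
The quoted rule can be sketched as follows. This is an illustration under
assumptions, not the Mesa implementation: the buffer_binding struct, its
range_specified flag and both query helpers are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

struct buffer_binding {
    bool bound;            /* a buffer object is attached at this index */
    bool range_specified;  /* true only for a BindBufferRange-style bind */
    int64_t offset;
    int64_t size;
};

/* Starting offset query: zero unless an explicit range was given. */
int64_t query_binding_start(const struct buffer_binding *b)
{
    return (b->bound && b->range_specified) ? b->offset : 0;
}

/* Size query: zero unless an explicit range was given, even if the
 * binding internally covers the whole buffer. */
int64_t query_binding_size(const struct buffer_binding *b)
{
    return (b->bound && b->range_specified) ? b->size : 0;
}
```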

Fixes:
KHR-GL45.direct_state_access.xfb_buffers

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-09-07 14:01:15 +02:00
Marek Olšák
7ec64bd88c radeonsi: don't read tcs_out_lds_layout.patch_stride from an SGPR
Same as before, writing TCS outputs to LDS is rare.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 13:00:07 +02:00
Marek Olšák
07fe10c75d radeonsi: don't read tcs_out_lds_layout.vertex_size from an SGPR
TCS outputs are usually not written to LDS, so no stats here.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 13:00:07 +02:00
Marek Olšák
89bf8668c2 radeonsi/gfx9: don't read LS out vertex stride from an SGPR in monolithic HS
-44 bytes in a monolithic LS-HS binary.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 13:00:07 +02:00
Marek Olšák
f974bb768b radeonsi: don't read the LS output vertex stride from an SGPR in LS
Now it's able to generate ds_write2_b64 instead of ds_write2_b32.

-20 bytes in one shader binary. (having only 1 output)

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 13:00:07 +02:00
Marek Olšák
22f5dfd300 radeonsi: don't read the number of TCS out vertices from an SGPR in TCS
-16 bytes in one shader binary.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 13:00:07 +02:00
Marek Olšák
17dd4856a6 radeonsi: don't always apply the PrimID instancing bug workaround on SI
It looks like commit 391673af7a that should
have fixed the perf regression didn't really change much if anything.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 13:00:06 +02:00
Marek Olšák
a0823df148 radeonsi: remove 2 callbacks from si_shader_context
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 13:00:06 +02:00
Marek Olšák
1cda9a2fee winsys/amdgpu: disable local BOs on Raven
It hangs with a high degree of reproducibility.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 12:57:48 +02:00
Marek Olšák
7b4b8f6373 disk_cache: make the thread queue resizable and low priority
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 12:57:14 +02:00
Thomas Hellstrom
e96d175c7d loader/dri3: Make sure we invalidate a drawable on size change
If we're seeing a drawable size change, in particular after processing a
configure notify event, make sure we invalidate so that the state tracker
picks up the new geometry.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-09-07 12:43:29 +02:00
Thomas Hellstrom
a727c804a2 loader/dri3: Process event after each fence wait
This tries to mimic dri2 behaviour where events are typically processed
while waiting for X replies. Since, during steady-state dri3 rendering, we
seldom wait for xcb replies, and haven't enabled any automatic event
processing, instead check for events after a fence wait.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-09-07 12:43:29 +02:00
Marek Olšák
e4018fdd85 st/mesa: skip draw calls with pipe_draw_info::count == 0
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102502

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-09-07 12:34:28 +02:00
Eric Engestrom
6c2e0527ea docs: update envvar docs to reflect MESA_NO_ERROR change
I changed the behaviour earlier today, but forgot to update the
corresponding docs.

Fixes: 77713a0acb "mesa: allow user to set MESA_NO_ERROR=0"
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-07 11:16:31 +01:00
Samuel Pitoiset
86b99893eb radv: do not use a bitfield when dirtying the vertex buffers
Useless to track which one has been updated because we
re-upload all the vertex buffers in one shot.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-07 10:01:21 +02:00
Samuel Pitoiset
2408f616e8 radv: remove unused radv_meta_saved_state::vertex_saved field
It's always false.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-07 10:01:21 +02:00
Eric Engestrom
77713a0acb mesa: allow user to set MESA_NO_ERROR=0
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102530
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-09-07 08:54:44 +01:00
Eric Engestrom
56f16c4fbb util: rename include guard to avoid clash
src/mesa/main/debug.h uses the same include guard.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 08:54:44 +01:00
Roland Scheidegger
6d9d6071ee llvmpipe, tgsi: hook up dx10 gather4 opcode
Trivial. We already support tg4 for legacy tex opcodes, so the actual
texture sampling code already handles it.
(Just like TG4, we don't handle additional capabilities and always sample
red channel.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-09-07 03:32:01 +02:00
Roland Scheidegger
de6810d9be llvmpipe, draw: increase shader cache limits
We're not particularly concerned with memory usage if the tradeoff is
shader recompiles. And it's common for apps to have a lot of shaders
nowadays (and since our shaders include a LOT of context state, we may
of course create quite a few more shaders still).
So quadruple the number of shaders draw will cache (from 128 to 512).
For llvmpipe (fs shaders), quadruple the number of instructions but keep
the number of variants the same for now (the variant limit could really
be reached only with very simple, non-texturing shaders), and simplify
the definition; it's probably easier to just have one different
definition per branch...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-09-07 03:32:01 +02:00
Dave Airlie
e852ecd22b ac/surface: reduce gfx9_surface_layout size.
152->144.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-07 11:00:08 +10:00
Dave Airlie
cc73ab9884 radv: reduce radv_amdgpu_winsys struct size.
1168->1160.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-07 11:00:08 +10:00
Dave Airlie
3cc620bf55 radv: reduce radv_image struct size.
1480->1472.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-07 11:00:08 +10:00
Dave Airlie
66031d8925 radv: reduce radv_shader_variant struct size.
544->536

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-07 11:00:08 +10:00
Dave Airlie
a2c2a76c9e radv: reduce radv_cmd_state struct size.
1632->1624.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-07 11:00:08 +10:00
Dave Airlie
f45e768413 radv: reduce meta_saved_state struct size.
904->896.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-07 11:00:07 +10:00
Dave Airlie
42d50c779b nir: put compact into bitfields in nir_variable_data
Declaring this as bool means it won't get merged with the previous
bitfields; this seems like an oversight rather than deliberate.

Noticed when running pahole.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-07 11:00:04 +10:00
Chad Versace
ec8ed2f277 anv: Annotate entrypoint table with index and func name
This helps when debugging a broken entrypoint table.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-06 13:07:12 -07:00
Leo Liu
e1e3c0384b radeon/uvd: fix the assertion check for YUYV format
Fixes: 7319ff87 ("radeon/uvd: add YUYV format support for target buffer")

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-09-06 15:53:18 -04:00
Anuj Phogat
ad160c2273 intel: Add brand string for KBL-R
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-09-06 10:09:44 -07:00
Anuj Phogat
4c4c28ca70 intel: Remove unused device info for KBL GT1.5
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-09-06 10:09:38 -07:00
Anuj Phogat
9c588ffdfb intel: Change a KBL pci id to GT2 from GT1.5
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-09-06 10:09:34 -07:00
Anuj Phogat
a000fca415 intel: Fix few KBL brand strings
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-09-06 10:09:25 -07:00
Anuj Phogat
ebc5ccf3cc intel: Remove unused Kabylake pci ids
These PCI IDs are not used in any Kabylake SKUs.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-09-06 10:08:58 -07:00
Emil Velikov
d03b06b35e Revert "Android: add -Wno-date-time flag for clang"
This reverts commit 6dae9176d6.

No longer needed as of last commit.

Cc: Rob Herring <robh@kernel.org>
2017-09-06 17:48:51 +01:00
Emil Velikov
54a789aa2a mesa: replace date/time macros with MESA_GIT_SHA1
The former is non-deterministic, results in non-reproducible builds, and
compilers throw a warning about it.

Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-06 17:48:50 +01:00
Emil Velikov
acf7f84564 mesa: don't use %s for PACKAGE_VERSION macro
The macro itself is a well defined string, which cannot cause issues
with printf or other printf-like functions.

All other places through Mesa already use it directly, so let's update
the final two instances.
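
As an illustration of the two styles (a sketch only; the function names
are hypothetical and the PACKAGE_VERSION value below is a placeholder,
the real one comes from the build system):

```c
#include <stdio.h>

#ifndef PACKAGE_VERSION
#define PACKAGE_VERSION "17.3.0-devel"  /* placeholder value for the sketch */
#endif

/* Old style: the version is passed as a %s argument. */
int format_version_arg(char *buf, size_t len)
{
    return snprintf(buf, len, "Mesa %s", PACKAGE_VERSION);
}

/* New style: adjacent string literals are concatenated at compile time. */
int format_version_literal(char *buf, size_t len)
{
    return snprintf(buf, len, "Mesa " PACKAGE_VERSION);
}
```

Both variants produce the same string as long as the macro expands to a
string literal without '%' characters.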

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-06 17:48:50 +01:00
Emil Velikov
d0a4b26915 docs/release-calendar: update and extend
v2: Correct 17.1.10 version, adjust some names.
v3: Add missing <tr> (Andres)

Cc: Juan A. Suárez <jasuarez@igalia.com>
Cc: Andres Gomez <agomez@igalia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
2017-09-06 17:48:50 +01:00
Emil Velikov
cf6e6eb5cd docs/releasing: polish LLVM_CONFIG wording/handling
Use consistent way to manage "non-default" llvm installations, clearly
documenting it.

AKA, use LLVM_CONFIG throughout and unset for the Windows/mingw builds.

v2: unset the save_ variable (Andres)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
2017-09-06 17:48:50 +01:00
Emil Velikov
0f24660245 docs/releasing: remove -jX instances
One can control the number of jobs via MAKEFLAGS. As such there's
little reason to set the number of jobs for each make invocation.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-06 17:48:50 +01:00
Emil Velikov
368734d014 .gitignore: list *.orig and *.rej
Should prevent accidental check-in of patch artefacts.

Suggested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-09-06 17:48:50 +01:00
Emil Velikov
c9d449de64 egl/x11: advertise __DRI_USE_INVALIDATE for DRI2
Back in 2012 (commit 1e7776ca2b - egl: Remove bogus invalidate code.)
the loader use of invalidate() was purged as "bogus". One of the factors
defining that statement was the lack of the loader-side invalidate
extension - __DRI_USE_INVALIDATE.

Since then the commit was reverted (commit eed0a80137 - egl: Restore
"bogus" DRI2 invalidate event code.), always performing the driver
invalidate call, although the loader was never updated to expose the
extension.

Do so now, allowing the driver to do fine-grained tuning.

Cc: Eric Anholt <eric@anholt.net>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-09-06 17:48:50 +01:00
Emil Velikov
f24bc18162 egl/x11/dri3: adding missing __DRI_BACKGROUND_CALLABLE extension
Fixes: 3b7b6adf3a ("egl: Implement __DRI_BACKGROUND_CALLABLE")
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-06 17:48:50 +01:00
Emil Velikov
731ba6924a i965: expose RGBA visuals only on Android
As Marek pointed out in an earlier commit, exposing RGBA on other
platforms introduces ~500 Visuals, which are not tested.

Note that this does not quite happen, yet. Reason being that the GLX
code does not check the masks - see scalarEqual().

Thus as we fix that, we'll run into the issue described.

v2: Rebase, while keeping loaderPrivate
v3: Beef-up commit message, getCapability() returns unsigned (Tapani)

Fixes: 1bf703e4ea ("dri_interface,egl,gallium: only expose RGBA visuals
on Android")
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Chad Versace <chadversary@chromium.org>
Cc: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-09-06 17:48:50 +01:00
Tim Rowley
dad32fc61c swr/rast: FE/Clipper - unify SIMD8/16 functions using simdlib types
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:02:36 -05:00
Tim Rowley
1ebf6fc865 swr/rast: Remove use of C++14 template variable
SWR rasterizer must remain C++11 compliant.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:02:29 -05:00
Tim Rowley
9df5691fff swr/rast: SIMD16 FE remove templated immediates workaround
Fixed properly in gcc-compatible fashion.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:02:23 -05:00
Tim Rowley
404ac6da9e swr/rast: SIMD16 PA - rename Assemble_simd16 to Assemble
For consistency and to support overloading.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:02:17 -05:00
Tim Rowley
6cb20c9f3a swr/rast: FE/Binner - unify SIMD8/16 functions using simdlib types
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:02:12 -05:00
Tim Rowley
6afdc8732c swr/rast: Removed some trailing whitespace caught during review
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:02:06 -05:00
Tim Rowley
4edc5d8305 swr: set caps for VB 4-byte alignment
Needed to compensate for change to fetch jit requiring
alignment.

Fixes regressions in piglit: vertex-buffer-offsets and about
another hundred of the vs-input*byte* tests.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:01:59 -05:00
Tim Rowley
4475583f5e swr/rast: Allow gather of floats from fetch shader with 2-4GB offsets
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:01:39 -05:00
Samuel Pitoiset
5c9af800cb radv: fix error code when resizing the upload BO
malloc() failures are unrelated to the device memory.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-06 15:52:19 +02:00
Gert Wollny
107ecd97f1 mesa/st/st_glsl_to_tgsi_temprename.cpp: Fix compilation with MSVC
If <windows.h> is included, then max is a macro that clashes
with std::numeric_limits::max, hence undefine it.
For some reason the struct access_record is not recognized
outside the anonymous namespace, so make it a class.
The patch was successfully tested on AppVeyor.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 15:12:19 +02:00
Gert Wollny
09ffe274b0 mesa/st: glsl_to_tgsi: tie in new temporary register merge approach
This patch replaces the old register lifetime estimation and
rename mapping evaluation with the new one.

To compare the performance of the current and the new implementation,
shader-db was run in one thread.

-----------------------------------------------------------
                    old          new(std::sort)

---------------- time ./run -j1 shaders --------------------

  real              5.80s          5.75s
  user              5.75s          5.70s
  sys               0.05s          0.05s

---- valgrind --tool=callgrind --dump-instr=yes------------

 merge               0.08%         0.18%
 estimate lifetime   0.02%         0.11%
 evaluate mapping  (incl=0.3%)     0.04%
 apply mapping       0.03%         0.02%

---   perf (approximate because of statistic sampling) ----

merge (total)        0.09%         0.16%
estimate lifetime    0.03%         0.10%
evaluate mapping  (incl=0.02%)     0.04%
apply mapping        0.04%         0.04%

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 11:49:52 +02:00
Gert Wollny
33b7728bf9 mesa/st: glsl_to_tgsi: Add test set for evaluation of rename mapping
The patch adds tests for the register rename mapping evaluation and
combined life time estimation and renaming.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 11:49:49 +02:00
Gert Wollny
84529c077b mesa/st: glsl_to_tgsi: add register rename mapping evaluator
The remapping evaluator first sorts the temporary registers in ascending
order of their first lifetime instruction, and then uses a binary search
to find merge candidates.
For the initial sorting it uses std::sort because qsort is quite slow in
comparison. By removing the define USE_STL_SORT in
  src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
one can enable the alternative code path that uses qsort.

Registers that are not written to are not considered for renaming since in
glsl_to_tgsi_visitor::renumber_registers they are eliminated anyway.
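
The sort-then-binary-search approach can be sketched roughly as below.
This is a simplified illustration, not the code from
st_glsl_to_tgsi_temprename.cpp: the lifetime struct, the function names
and the greedy merge loop are assumptions, it uses qsort so that it stays
plain C, and it treats a register as mergeable when its first use is at
or after the target's last use.

```c
#include <stdlib.h>

struct lifetime {
    int reg;     /* register index, assumed to be < nregs below */
    int begin;   /* first instruction that uses the register */
    int end;     /* last instruction that uses the register */
    int merged;  /* set once the register was renamed into another one */
};

static int by_begin(const void *a, const void *b)
{
    const struct lifetime *x = a, *y = b;
    return (x->begin > y->begin) - (x->begin < y->begin);
}

/* Binary search: first index in [lo, hi) whose begin is >= bound. */
static size_t first_not_before(const struct lifetime *v, size_t lo,
                               size_t hi, int bound)
{
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (v[mid].begin >= bound)
            hi = mid;
        else
            lo = mid + 1;
    }
    return lo;
}

/* On return, new_reg[r] is the register r is renamed to, or -1. */
void evaluate_remapping(struct lifetime *v, size_t n,
                        int *new_reg, size_t nregs)
{
    for (size_t i = 0; i < nregs; i++)
        new_reg[i] = -1;

    qsort(v, n, sizeof *v, by_begin);

    for (size_t t = 0; t < n; t++) {
        if (v[t].merged)
            continue;
        size_t s = t + 1;
        for (;;) {
            s = first_not_before(v, s, n, v[t].end);
            while (s < n && v[s].merged)   /* skip already renamed ones */
                s++;
            if (s == n)
                break;
            new_reg[v[s].reg] = v[t].reg;
            v[s].merged = 1;
            v[t].end = v[s].end;           /* target now lives longer */
            s++;
        }
    }
}
```

For the lifetimes [0,3], [1,5], [4,6] and [6,8] of registers 0..3, this
sketch renames registers 2 and 3 to register 0 and leaves register 1 alone.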

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 11:49:46 +02:00
Gert Wollny
7be6d8fe12 mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime tracker
This patch adds a set of unit tests for the new lifetime tracker.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 11:49:43 +02:00
Gert Wollny
978c437b12 mesa/st: glsl_to_tgsi: implement new temporary register lifetime tracker
This patch adds a class for tracking the lifetimes of temporary registers
in the glsl to tgsi translation. The algorithm runs in three steps:
First, in order to minimize the number of needed memory allocations, the
program is scanned to evaluate the number of scopes.
Then, the program is scanned a second time to record the important register
access time points: first and last reads and writes and their link to the
execution scope (loop, if/else branch, switch case).
In the third step, the actual minimal lifetime is evaluated for each
register.

In addition, when compiled in debug mode (i.e. NDEBUG is not defined)
the shaders and estimated temporary life times can be logged to stderr
by setting the environment variable GLSL_TO_TGSI_RENAME_DEBUG.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 11:49:39 +02:00
Gert Wollny
732246701f mesa/st: glsl_to_tgsi move some helper classes to extra files
To prepare the implementation of a temp register lifetime tracker,
some of the classes are moved into separate header/implementation
files to make them accessible from other files.

Specifically these are:

    class st_src_reg;
    class st_dst_reg;
    class glsl_to_tgsi_instruction;
    struct rename_reg_pair;

    int swizzle_for_type(const glsl_type *type, int component);

  as inline:

    bool is_resource_instruction(unsigned opcode);
    unsigned num_inst_dst_regs(const glsl_to_tgsi_instruction *op);
    unsigned num_inst_src_regs(const glsl_to_tgsi_instruction *op);

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 11:49:27 +02:00
Dave Airlie
b65ff7a02d st_glsl_to_tgsi: rewrite rename registers to use array fully.
Instead of having to search the whole array, index it directly by
register and store a valid bit in each entry along with the rename.

Removes this from the profile on some of the fp64 tests
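
A rough sketch of the direct-indexed layout, under assumptions: the field
names and the rename_temp_registers helper below are illustrative and may
not match the real rename_reg_pair, and operands are reduced to plain
register indices.

```c
#include <stdbool.h>
#include <stddef.h>

struct rename_reg_pair {
    bool valid;   /* false: this register keeps its index */
    int new_reg;
};

/* One entry per temporary, indexed by the old register number, so each
 * operand is resolved with a single array access instead of a search. */
void rename_temp_registers(int *operands, size_t n_operands,
                           const struct rename_reg_pair *renames)
{
    for (size_t i = 0; i < n_operands; i++) {
        const struct rename_reg_pair *r = &renames[operands[i]];
        if (r->valid)
            operands[i] = r->new_reg;
    }
}
```

The cost per operand is constant, at the price of one array entry for
every temporary whether or not it is renamed.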

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 11:44:16 +02:00
Nicolai Hähnle
45c5c44451 radeonsi/gfx9: proper workaround for LS/HS VGPR initialization bug
When the HS wave is empty, the hardware writes the LS VGPRs starting at
v0 instead of v2. Workaround by shifting them back into place when
necessary. For simplicity, this is always done in the LS prolog.

According to the hardware team, this will be fixed in future chips,
so take that into account already.

Note that this is not a bug fix, as the bug was already worked
around by commit 166823bfd2 ("radeonsi/gfx9: add a temporary workaround
for a tessellation driver bug"). This change merely replaces the
workaround by one that should be better.

v2: add workaround code to shader only when necessary
v3: clarify the prefer_mono comment

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-06 10:02:49 +02:00
Nicolai Hähnle
552aaa11ed ac/debug: take ASIC generation into account when printing registers
There were some overlapping changes in gfx9 especially in the CB/DB
blocks which made register dumps rather misleading.

The split is along the lines of the header files, so we'll print VI-only
fields on SI and CI, for example, but we won't print GFX9 fields on
SI/CI/VI, and we won't print SI/CI/VI fields on GFX9.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-09-06 09:59:19 +02:00
Nicolai Hähnle
274f1dace7 amd/common: pass chip_class to ac_dump_reg
Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-09-06 09:59:17 +02:00
Nicolai Hähnle
925ad7d2f6 ac/sid_tables: add FieldTable object
Automatically re-use table entries like StringTable and IntTable do.
This allows us to get rid of the "fields_owner" logic, and simplifies
the next change.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-09-06 09:59:14 +02:00
Nicolai Hähnle
981335b704 ac/sid_tables: remove unused variable varname_values
Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-09-06 09:59:07 +02:00
Nicolai Hähnle
34124e412f radeonsi/gfx9: always flush DB metadata on framebuffer changes
This fixes GL45-CTS.shader_image_load_store.basic-glsl-earlyFragTests.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-06 09:57:08 +02:00
Nicolai Hähnle
1e247511e5 util/ralloc: set prev-pointers correctly in ralloc_adopt
Found by inspection.

I'm not aware of any actual failures caused by this, but a precise
sequence of ralloc_adopt and ralloc_free should be able to cause
problems.
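
To illustrate the kind of link that matters here, a sketch of adopting a
doubly linked child list. It uses a hypothetical node struct and function
name rather than ralloc's own header layout.

```c
#include <stddef.h>

struct node {
    struct node *parent;
    struct node *child;  /* first child */
    struct node *prev;   /* previous sibling */
    struct node *next;   /* next sibling */
};

/* Move all children of old_parent to the front of new_parent's list. */
void adopt_children(struct node *new_parent, struct node *old_parent)
{
    struct node *last;

    if (old_parent->child == NULL)
        return;

    /* Reparent every child and remember the last one. */
    for (last = old_parent->child; ; last = last->next) {
        last->parent = new_parent;
        if (last->next == NULL)
            break;
    }

    /* Splice the adopted list in front of the existing children. */
    last->next = new_parent->child;
    if (last->next != NULL)
        last->next->prev = last;  /* back link of the former first child */

    new_parent->child = old_parent->child;
    old_parent->child = NULL;
}
```

If the back link is left stale, a later unlink of the former first child
would go through the wrong prev pointer, which is the sort of sequence
the message above alludes to.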

v2: make the code slightly clearer (Eric)

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-06 09:56:19 +02:00
Iago Toral Quiroga
94f740e3fc mesa/main: Fix GetTextureImage error reporting
GetTex*Image should return INVALID_ENUM if target is not valid, however,
GetTextureImage does not receive a target, and instead should return
INVALID_OPERATION if the effective target is not valid. From the
OpenGL 4.6 core profile spec, section 8.11 Texture Queries:

"An INVALID_OPERATION error is generated by GetTextureImage if the effective
 target is not one of TEXTURE_1D, TEXTURE_2D, TEXTURE_3D, TEXTURE_1D_ARRAY,
 TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY, TEXTURE_RECTANGLE, or
 TEXTURE_CUBE_MAP (for GetTextureImage only)."

Fixes:
KHR-GL45.direct_state_access.textures_image_query_errors

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-09-06 08:19:53 +02:00
Tapani Pälli
c77ea0501c egl: remove unused 'Screens' array from _egl_display
This was used by EGL_MESA_screen_surface that has been removed
in commit 7a58262e58.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-06 07:59:14 +03:00
Dave Airlie
e38685cc62 Revert "radv: disable support for VEGA for now."
This reverts commit 611076a41a.

With the two previous commits, vega shouldn't be unstable,
doesn't pass CTS, but can do a complete run, and games shouldn't
hang anymore, so bring it back online.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 03:23:10 +01:00
Dave Airlie
6d929d3f85 radv/gfx9: set descriptor up for base_mip to level range.
This is required on GFX9, fixes a bug in Talos where all the
mipmaps overlay each other.

Just pushing this as well as it fixes Talos.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 03:22:22 +01:00
Dave Airlie
d118ff8765 radv: disable 1d/2d linear optimisation on gfx9.
This causes hangs in some of the CTS tests with a 2d
1536x2 texture.

This fixes hangs with:
dEQP-VK.pipeline.image.suballocation.sampling_type.combined.view_type.1d_array.format.r4g4b4a4_unorm_pack16.count_1.size.512x1_array_of_3
if we reenable it, make sure these don't regress.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 03:06:08 +01:00
Dave Airlie
b880cd3b59 radv/gfx9: fix buffer size on gfx9.
The VI sizing only applies to VI.

This fixes:
dEQP-VK.image.image_size.buffer.*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 03:05:44 +01:00
Bas Nieuwenhuizen
ff23e03d60 radv: Fix vkCopyImage with both depth and stencil aspects.
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-06 01:54:37 +02:00
Dave Airlie
9e6b382142 mesa/mtypes: repack gl_sampler_object.
160->152.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 06:14:25 +10:00
Dave Airlie
ff6123925c mesa/mtypes: repack gl_texture_object.
reduces size from 1144 to 1128.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 06:13:52 +10:00
Dave Airlie
ef660abdd5 mesa/mtypes: repack gl_shader_program_data.
This reduces the size from 144 bytes to 128 bytes.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 06:13:22 +10:00
Dave Airlie
449ac347dd mesa/mtypes: reorganise gl_shader
This reduces this from 200->182 bytes.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 06:13:03 +10:00
Dave Airlie
a53c63e46b mesa/mtypes: repack display list structs.
This reduces each of these by 8 bytes.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 06:12:53 +10:00
Dave Airlie
a265ffa69f mesa/mtypes: reduce size of gl_sync_object.
Drops from 40->32 bytes.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 06:12:47 +10:00
Dave Airlie
e4bcbe03b5 mesa/mtypes: reorg vertex/fragment program state.
reduces both of these by 8 bytes.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 06:12:44 +10:00
Dave Airlie
cff02d214f mesa/bindless: reorder gl_bindless_image gl_bindless_sampler.
This makes these use 16-bytes instead of 24-bytes.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 06:12:12 +10:00
Samuel Pitoiset
7f952eb931 radv: fix a memleak when compiling the GS copy shader
Found by inspection.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-05 21:36:44 +02:00
Charmaine Lee
c12ef63b69 svga: move index buffer bind flag assertion
The buffer bind flags can be promoted in svga_buffer_handle(), so
move the assertion after it. This has already been done for
vertex buffer in commit 6b4bf7e8be, but it misses the one for
index buffer.

Fixes assertion running WarThunder.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-09-05 10:31:18 -06:00
Charmaine Lee
98badd7f6e svga: avoid emitting redundant SetShaderResources and SetVertexBuffers
Minor performance improvement in avoiding binding the same shader resource
or the same vertex buffer for the same slot.

Tested with MTT glretrace.

v2: Per Brian's suggestion, add a helper function to do vertex buffer
    comparison.
v3: Change the helper function to vertex_buffers_equal().

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-09-05 10:31:18 -06:00
Jason Ekstrand
e439908af9 spirv: Add support for the HelperInvocation builtin
I have no idea how this got missed but it's been missing since forever.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-09-05 16:39:24 +03:00
Thomas Hellstrom
86df05eb26 loader/dri3: Use client local back to front blit in copySubBuffer if available
The copySubBuffer functionality always attempted a server side blit from
back to fake front if a fake front was present, and we weren't displaying
on a remote GPU.

Now that we always have local blit capability on modern drivers, first
attempt a local blit, and only if that fails, try the server blit.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Axel Davy <axel.davy@normalesup.org>
2017-09-05 12:22:17 +02:00
Marek Olšák
c3ebac6890 radeonsi/gfx9: implement primitive binning
This increases performance, but it was tuned for Raven, not Vega.
We don't know yet how Vega will perform, hopefully not worse.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-05 12:09:02 +02:00
Marek Olšák
51e10c2770 radeonsi: add more state flags into si_state_dsa
3 flags for primitive binning, 2 flags for out-of-order rasterization
(but that will be done some other time)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-05 12:09:02 +02:00
Marek Olšák
0797eea758 radeonsi/gfx9: don't use BREAK_BATCH and FLUSH_DFSM if DFSM is disabled
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-05 12:09:02 +02:00
Tapani Pälli
0986f68632 vbo: fix build errors on android
incompatible pointer to integer conversion assigning to 'GLintptr' (aka 'int')
from 'const char *' [-Werror,-Wint-conversion]

      offset = indices;
             ^ ~~~~~~~

Fixes: 2d93b462b4 ("vbo: fix offset in minmax cache key")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-05 07:55:34 +03:00
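The build error quoted in 0986f68632 comes from assigning a pointer to an integer type without a cast. A minimal sketch of the usual remedy follows, assuming the pointer value really carries a buffer offset; `ex_intptr` is a hypothetical stand-in for `GLintptr` and `ex_offset_from_indices` is not a Mesa function.

```c
#include <stdint.h>

/* Hypothetical stand-in for GLintptr. */
typedef intptr_t ex_intptr;

/* Interpret an "indices" pointer as a byte offset into a bound buffer. */
ex_intptr ex_offset_from_indices(const void *indices)
{
    /* A plain `offset = indices;` triggers -Wint-conversion; the explicit
     * cast states that the conversion is intended. */
    return (ex_intptr)(intptr_t)indices;
}
```

Going through `intptr_t` first keeps the conversion well defined even when the destination type is narrower than a pointer.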
Emil Velikov
bddf4a51c1 docs: add news item and link release notes for 17.2.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-09-04 18:26:34 +01:00
Emil Velikov
cd48ffc755 docs: add sha256 checksums for 17.2.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b4473dd519)
2017-09-04 18:24:52 +01:00
Emil Velikov
f60fe7a448 docs: Update 17.2.0 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f5925b2897)
2017-09-04 18:24:51 +01:00
Marek Olšák
fb7ba68f6c radeonsi: eliminate PS color outputs when colormask kills them
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-04 15:10:39 +02:00
Marek Olšák
468c131033 gallium/radeon: sort DBG shader flags according to pipe_shader_type
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-04 15:10:39 +02:00
Nicolai Hähnle
50283109aa radeonsi: ensure cache flushes happen before SET_PREDICATION packets
The data is read when the render_cond_atom is emitted, so we must
delay emitting the atom until after the flush.

Fixes: 0fe0320dc0 ("radeonsi: use optimal packet order when doing a pipeline sync")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-04 13:50:57 +02:00
Nicolai Hähnle
097cfe9fde radeonsi: fix ARB_transform_feedback_overflow_query on <= VI
The result written by the shader workaround needs to be written back, or
the CP may read stale data.

Fixes: 78476cfe07 ("radeonsi: enable ARB_transform_feedback_overflow_query")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-04 13:50:54 +02:00
Nicolai Hähnle
55df3d2286 radeonsi: fix compute shader state dumping
Fixes: 420c438589 ("radeonsi: log draw and compute state into log context")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-04 13:50:47 +02:00
Nicolai Hähnle
30a2f0dfd4 radeonsi: add an assertion that only two-dimensional constant references are used
v2: remove some redundant checks

Acked-by: Roland Scheidegger <sroland@vmware.com> (v1)
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (v1)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-09-04 13:44:09 +02:00
Nicolai Hähnle
3e4dff4f00 gallium/radeon: always use two-dimensional constant references
Acked-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-09-04 13:44:06 +02:00
Nicolai Hähnle
83923a1f17 gallium/tests: always use two-dimensional constant references
Acked-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-09-04 13:44:04 +02:00
Nicolai Hähnle
33661190d2 pp: always use two-dimensional constant references
Acked-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-09-04 13:44:01 +02:00
Nicolai Hähnle
41fba40776 gallium/hud: always use two-dimensional constant references
Acked-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-09-04 13:43:59 +02:00
Nicolai Hähnle
f143354d06 nine: always generate two-dimensional constant file accesses
Acked-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-09-04 13:43:56 +02:00
Nicolai Hähnle
d14f7f7210 st/glsl_to_tgsi: inline src_register into translate_src
src_register has no meaningful standalone use, it only makes sense when
called from translate_src.

v2: fix input array handling

Acked-by: Roland Scheidegger <sroland@vmware.com> (v1)
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-09-04 13:43:52 +02:00
Nicolai Hähnle
42b444ca18 st/glsl_to_tgsi: ir_load_ubo always has a second index
Acked-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-09-04 13:43:49 +02:00
Nicolai Hähnle
1424163798 st/drawpixels: always use two-dimensional constant references
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-09-04 13:43:46 +02:00
Nicolai Hähnle
a852ae3620 tgsi/build: always generate two-dimensional constant file accesses
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-09-04 13:43:44 +02:00
Nicolai Hähnle
41e342d548 tgsi/ureg: always emit constants (and their decls) as 2D
Acked-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-09-04 13:43:40 +02:00
Nicolai Hähnle
37dd8e8dee gallium: all drivers should accept two-dimensional constant buffer indexing
Most older drivers seem to just ignore the Dimension setting, so virtually
no changes should be needed.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-09-04 13:43:36 +02:00
Eric Engestrom
0c7272a66c anv: fix off by one in array check
`anv_formats[ARRAY_SIZE(anv_formats)]` is already one too far.
Spotted by Coverity.

CovID: 1417259
Fixes: 242211933a "anv/formats: Nicely handle unknown VkFormat enums"
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-09-04 08:05:36 +01:00
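The off-by-one in 0c7272a66c is the classic `>` versus `>=` bounds check. The sketch below uses a hypothetical table `ex_formats` and helper `ex_lookup_format`, not anv's real data, to show why an index equal to `ARRAY_SIZE()` must already be rejected.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical format table with three valid entries (indices 0..2). */
static const int ex_formats[] = { 10, 20, 30 };

/* Return a pointer to the entry, or NULL when the index is out of range. */
const int *ex_lookup_format(size_t index)
{
    /* `index > ARRAY_SIZE(ex_formats)` would let index == 3 through and
     * read one element past the end of the array. */
    if (index >= ARRAY_SIZE(ex_formats))
        return NULL;

    return &ex_formats[index];
}
```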
Dave Airlie
979be4f9c8 ac: reorg ac_shader_binary struct to take less space.
This reduces the size from 96 to 80 bytes by putting all the
32-bit sizes at the start.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-04 08:40:37 +10:00
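The effect behind 979be4f9c8 and the mtypes repacking commits is alignment padding. The two structs below are hypothetical and are not the real `ac_shader_binary` layout; they only sketch how grouping the 32-bit fields removes the padding that otherwise precedes each pointer on a 64-bit target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: each 32-bit size is followed by a pointer, so a
 * 64-bit ABI inserts 4 bytes of padding after every size field. */
struct ex_binary_interleaved {
    uint32_t code_size;
    unsigned char *code;
    uint32_t config_size;
    unsigned char *config;
    uint32_t rodata_size;
    unsigned char *rodata;
};

/* Same fields with the 32-bit sizes grouped at the start; only one
 * padding gap remains before the first pointer. */
struct ex_binary_packed {
    uint32_t code_size;
    uint32_t config_size;
    uint32_t rodata_size;
    unsigned char *code;
    unsigned char *config;
    unsigned char *rodata;
};
```

On a typical LP64 target the first struct takes 48 bytes and the second 40; on a 32-bit target both are the same size, so the gain depends on the ABI.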
Dave Airlie
2b79bbde89 radv: drop emit2d_dst_type.
This is completely unused now.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-04 08:40:27 +10:00
Xavier Bouchoux
bf8637addf radv/meta: missing initialisations in create_pass().
Otherwise radv_cmd_state_setup_attachments() will complain it has no clearvalues,
when called via radv_process_depth_image_inplace().

v2: use LOAD/STORE instead of DONT_CARE, to preserve stencil values.

Signed-off-by: Xavier Bouchoux <xavierb@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-04 00:26:39 +02:00
Bas Nieuwenhuizen
45e68ed065 radv: Enable command buffer chaining by default.
For approx 5-10% performance improvement in dota2.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-04 00:06:40 +02:00
Bas Nieuwenhuizen
1a72ca5667 radv: Put semaphore waits in preamble cs.
The separate flush cs gets in the way of batchchain.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-04 00:06:40 +02:00
Bas Nieuwenhuizen
dec7b38fe6 radv: Actually set the cmd_buffer usage_flags.
Otherwise, the simultaneous usage bit doesn't get set from the begin
info, which we need for batchchaining.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-09-04 00:06:40 +02:00
Eric Engestrom
49b428470e util: improve compiler guard
Glibc 2.26 has dropped xlocale.h, but the functions needed (strtod_l()
and strtof_l()) can be found in stdlib.h.
Improve the detection method to allow newer builds to still make use of
the locale-setting.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102454
Cc: Laurent Carlier <lordheavym@gmail.com>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Laurent Carlier <lordheavym@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-09-03 09:05:23 +01:00
Leo Liu
8514c5d078 radeon/uvd: add Define Restart Interval to MJPEG bitstream reconstruction
It adds the capacity to decode MJPEG stream with DRI marker

Signed-off-by: Leo Liu <leo.liu@amd.com>
2017-09-02 21:33:11 -04:00
Leo Liu
3b02a8e9dd radeon/uvd: fix MJPEG quantization table index
Fixes: 130d1f456b ("radeon/uvd: reconstruct MJPEG bitstream")

Signed-off-by: Leo Liu <leo.liu@amd.com>
2017-09-02 21:33:11 -04:00
Roland Scheidegger
2b2c61f0df st/mesa: fix view template initialization in try_pbo_readpixels
I think this is what the code was meant to do, albeit as far as I can tell
the redundant initialization some analyzers complain about should work as
well just fine (only the first layer will be used, if the view contains one
or more layers doesn't really matter).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102467
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2017-09-03 03:31:28 +02:00
Kenneth Graunke
23b7c7a630 genxml: Make Border Color Pointer an address on Gen4-5, not an offset.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-02 12:56:18 -07:00
Kenneth Graunke
b8cd8a7545 i965: Inline emit_reloc in __genx_combine_address
One less layer of baklava.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-02 12:56:18 -07:00
Kenneth Graunke
52b65dfda8 i965: Fix crash in fallback GTT mapping.
We can't perf_debug without a context.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-09-02 12:56:18 -07:00
Kenneth Graunke
e5654fc450 i965: Fix state flagging of Gen6 SOL programs.
It doesn't seem like the old code could possibly work.

1. brw_gs_state_dirty made us bail unless one of these flags were set:
   _NEW_TEXTURE, BRW_NEW_GEOMETRY_PROGRAM, BRW_NEW_TRANSFORM_FEEDBACK
2. If there was no geometry program, we called brw_upload_ff_gs_prog().
3. That checked brw_ff_gs_state_dirty and bailed unless these were set:
   _NEW_LIGHT, BRW_NEW_PRIMITIVE, BRW_NEW_TRANSFORM_FEEDBACK,
   BRW_NEW_VS_PROG_DATA.
4. brw_ff_gs_prog_key pv_first and attr fields were set based on data
   depending on _NEW_LIGHT and BRW_NEW_VS_PROG_DATA.

This means that if we needed a FF GS program, and changed the VS
outputs or provoking vertex mode, we'd fail to notice that we needed
to emit a new program.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-09-02 12:56:18 -07:00
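The nested dirty checks listed in e5654fc450 can be modelled with two bit masks. This is a sketch under assumptions: the `EX_` flag values and both helper functions are invented for illustration and do not match i965's real definitions. It shows why a change flagged only by the inner mask never reaches the inner check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical dirty bits; the values are arbitrary. */
enum {
    EX_NEW_TEXTURE            = 1u << 0,
    EX_NEW_LIGHT              = 1u << 1,
    EX_NEW_GEOMETRY_PROGRAM   = 1u << 2,
    EX_NEW_TRANSFORM_FEEDBACK = 1u << 3,
    EX_NEW_PRIMITIVE          = 1u << 4,
    EX_NEW_VS_PROG_DATA       = 1u << 5
};

/* Flags tested by the outer check (step 1 of the commit message). */
static const uint32_t ex_outer_mask =
    EX_NEW_TEXTURE | EX_NEW_GEOMETRY_PROGRAM | EX_NEW_TRANSFORM_FEEDBACK;

/* Flags tested by the inner check (step 3 of the commit message). */
static const uint32_t ex_inner_mask =
    EX_NEW_LIGHT | EX_NEW_PRIMITIVE | EX_NEW_TRANSFORM_FEEDBACK |
    EX_NEW_VS_PROG_DATA;

bool ex_state_dirty(uint32_t dirty, uint32_t mask)
{
    return (dirty & mask) != 0;
}

/* The inner check is only reached when the outer check passes. */
bool ex_nested_upload_needed(uint32_t dirty)
{
    if (!ex_state_dirty(dirty, ex_outer_mask))
        return false;

    return ex_state_dirty(dirty, ex_inner_mask);
}
```

With only `EX_NEW_LIGHT` set, `ex_nested_upload_needed()` returns false even though the inner mask lists that flag, which is the missed-update case the commit describes.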
Kenneth Graunke
4ddbc0a071 i965: Drop useless gen6_brw_upload_ff_gs_prog() wrapper.
gen6...brw?  Drop some baklava layers.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-09-02 12:56:18 -07:00
Rob Clark
dc9e08b0c3 freedreno: skip batch-cache for compute shaders
It is kind of pointless for compute, and avoids issues with apps kicking
off more than 32 compute shaders at once.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
2017-09-02 11:41:20 -04:00
Vinson Lee
39a69f0692 m4: Use older autoconf 2.63 compatible ax_check_compile_flag.
CentOS 6 and RHEL 6 have autoconf 2.63.

Fixes: e4b2b69e82 ("configure: Add and use AX_CHECK_COMPILE_FLAG")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-09-01 16:30:40 -07:00
Kenneth Graunke
01f29366e3 i965: Move BATCH_SZ define into intel_batchbuffer.c.
It's only used in one file.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-01 09:59:41 -07:00
Kenneth Graunke
5ae631c544 i965: Drop batch_size argument from brw_bufmgr_init().
This is dead code and hasn't been used in a long time.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-01 09:59:40 -07:00
Chris Wilson
598503e285 i965: Rename brw_bo::offset64 to gtt_offset.
We can drop the meaningless "64" suffix - libdrm_intel originally had
an "offset" field that was an "unsigned long" which was the wrong size,
and we couldn't remove/alter that field without breaking ABI, so we had
to add a uint64_t "offset64" field.

"gtt_offset" is also more descriptive than "offset".

(Patch originally written by Ken, but Chris suggested a better name and
supplied the giant comment making up the bulk of the patch, so I changed
the authorship to him.)

Acked-by: Kenneth Graunke <kenneth@whitecape.org>

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-01 09:59:39 -07:00
Kenneth Graunke
804f78feb4 i965: Drop the BRW_BATCH_STRUCT macro.
It's used in exactly one place these days, and not much simpler than
just calling intel_batchbuffer_data directly.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-01 09:59:36 -07:00
Kenneth Graunke
6ec7bddb19 i965: Don't double count the batch in aperture_space.
intel_batchbuffer_reset calls add_exec_bo on the batch right away,
which adds in the batch BO size.

Fixes: 29ba502a4e ("i965: Use I915_EXEC_BATCH_FIRST when available.")

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-01 09:59:25 -07:00
Cherniak, Bruce
43145bbf09 swr: Report format max_samples=1 to maintain support for "fake" msaa.
Accompanying patch "st/mesa: only try to create 1x msaa surfaces for
'fake' msaa" requires driver to report max_samples=1 to enable "fake"
msaa. Previously, 0 and 1 were treated equivalently in st_init_extensions()
and either could enable "fake" msaa.

This patch raises the swr default msaa_max_count from 0 to 1, so that
swr_is_format_supported will report max_samples=1.

Real msaa can still be enabled by exporting SWR_MSAA_MAX_COUNT with a
pow2 value between 2 and 16.

This patch is necessary to prevent an OpenSWR regression resulting from
the st/mesa patch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102038
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2017-09-01 11:23:16 -05:00
Eric Engestrom
4d6c23ee83 aubinator: remove duplicate initialisation
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-01 17:06:43 +01:00
Samuel Pitoiset
80177306d9 radv: report VM faults if detected
It's fairly simple for now, but this might be quite useful.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-01 09:46:36 +02:00
Samuel Pitoiset
12cbd9a13f radeonsi: move si_vm_fault_occured() to AMD common code
For radv, in order to report VM faults when detected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-01 09:46:32 +02:00
Samuel Pitoiset
72d9ffc72c radv: add radv_check_gpu_hangs() helper function
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-01 09:46:00 +02:00
Samuel Pitoiset
f14020c15f radv: disassemble SPIR-V binaries with RADV_DEBUG=spirv
This introduces a new separate option because the output can
be quite verbose. If spirv-dis is not found in the path, this
debug option is useless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-01 09:41:54 +02:00
Samuel Pitoiset
ad42e2abb8 radv: move RADV_TRACE_FILE functions to radv_debug.c
At the moment, debugging radv is not really easy because the
driver doesn't report enough information when it hangs. This
new file will be the main location for all debug tools.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-01 09:41:54 +02:00
Samuel Pitoiset
f1f2f00f6a radv: silence a compiler warning in radv_emit_framebuffer_state()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-01 09:38:52 +02:00
Samuel Pitoiset
962fda5b90 radv: compute correct maximum wave count per SIMD
Ported from RadeonSI (original patch by Marek).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-01 09:38:50 +02:00
Brian Paul
9eca7e0ddb st/mesa: only try to create 1x msaa surfaces for "fake" msaa drivers
For software drivers where we want "fake" msaa support for GL 3.x, we
treat 1 sample as being msaa.

For drivers with real msaa support, start format probing at 2x msaa.
For drivers with fake msaa support, start format probing at 1x msaa.

This also tweaks the MaxSamples code in st_init_extensions() so that
we use MaxSamples=1 for fake msaa.  This allows the format probe loops
to run at least one iteration.

This fixes a llvmpipe/VTK regression from commit 6839d33699.
And for drivers with fake msaa support, calls such as
glTexImage2DMultisample(samples=1) will now succeed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102038
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102125
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-08-31 22:09:57 -06:00
Tobias Klausmann
1c4e6d7ca8 nvc0/ir: propagate immediates to CALL input MOVs
On using builtin functions we have to move the input to registers $0 and $1. If
one of the input values is an immediate, we fail to propagate the immediate:

...
mov u32 $r477 0x00000003 (0)
...
mov u32 $r0 %r473 (0)
mov u32 $r1 $r477 (0)
call abs BUILTIN:0 (0)
mov u32 %r495 $r1 (0)
...

With this patch the immediate is propagated, potentially causing the first MOV
to be superfluous, which we'd remove in that case:

...

mov u32 $r0 %r473 (0)
mov u32 $r1 0x00000003 (0)
call abs BUILTIN:0 (0)
mov u32 %r495 $r1 (0)
...

Shaderdb stats:
total instructions in shared programs : 4893460 -> 4893324 (-0.00%)
total gprs used in shared programs    : 582972 -> 582881 (-0.02%)
total local used in shared programs   : 17960 -> 17960 (0.00%)

                local        gpr       inst      bytes
    helped           0          91         112         112
      hurt           0           0           0           0

v2:
 implement some changes proposed by imirkin, the manual deletion of the dead
 mov is necessary after ea22ac23e0 ("nvc0/ir: unlink values pre- and post-call
 to division function") as the potentially dead mov is unlinked properly,
 causing later passes to not notice the mov op at all and thus not cleaning it
 up. That makes up a big chunk of the regression the above commit caused.
 Keep the deletion of the op where it is, deleting it later unnecessarily blows
 up size of the change.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-08-31 22:58:06 -04:00
Karol Herbst
b672c3833b nvc0: write 0 to pipeline_statistics.cs_invocations
cs_invocations are currently unsupported, but leaving the field uninitialized
is even worse.

fixes on nvc0:
 * KHR-GL45.pipeline_statistics_query_tests_ARB.functional_default_qo_values
 * KHR-GL45.pipeline_statistics_query_tests_ARB.functional_non_rendering_commands_do_not_affect_queries

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-08-31 22:57:22 -04:00
Ben Crocker
57c8ead0cd llvmpipe: lp_build_gather_elem_vec BE fix for 3x16 load
Fix loading of a 3x16 vector as a single 48-bit load
on big-endian systems (PPC64, S390).

Roland Scheidegger's commit e827d91756
plus Ray Strode's patch reduce pre-Roland Piglit failures from ~4000 to ~2000.  This patch fixes
three of the four regressions observed by Ray:

- draw-vertices
- draw-vertices-half-float
- draw-vertices-half-float_gles2

One regression remains:
- draw-vertices-2101010

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100613
Cc: "17.2" "17.1" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-09-01 01:20:07 +02:00
Ray Strode
75cb6e3617 gallivm: correct channel shift logic on big endian
lp_build_fetch_rgba_soa fetches a texel from a texture.
Part of that process involves first gathering the element
together from memory into a packed format, and then breaking
out the individual color channels into separate, parallel
arrays.

The code fails to account for endianness when reading the packed
values.

This commit attempts to correct the problem by reversing the order
the packed values are read on big endian systems.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100613
Cc: "17.2" "17.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ray Strode <rstrode@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-09-01 01:19:13 +02:00
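The endianness problem in 75cb6e3617 can be sketched for the simplest case. The helper below is hypothetical and is not gallivm code; it assumes equal-width channels stored in memory order and loaded as one native integer, so that the first channel sits in the low bits on little-endian hosts and in the high bits on big-endian hosts.

```c
#include <stdbool.h>

/* Bit shift needed to extract `channel` from a packed element that was
 * loaded from memory as a single native integer. Assumes all channels
 * have the same width and are stored in channel order. */
unsigned ex_channel_shift(unsigned channel, unsigned bits_per_channel,
                          unsigned num_channels, bool big_endian)
{
    if (big_endian) {
        /* The channel order is reversed within the loaded integer. */
        return (num_channels - 1 - channel) * bits_per_channel;
    }

    return channel * bits_per_channel;
}
```

For a four-channel, 8-bit-per-channel element the first channel is extracted with shift 0 on little endian and shift 24 on big endian.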
Roland Scheidegger
c92fe8a8c5 util: only use SCHED_IDLE in pthread_setschedparam() when it's defined
Fixes build error when it's not.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-09-01 01:10:32 +02:00
Jason Ekstrand
242211933a anv/formats: Nicely handle unknown VkFormat enums
This fixes some crashes in the dEQP-VK.memory.requirements.core.* tests.
I'm not sure whether or not passing out-of-bound formats into the query
is supposed to be allowed but there's no harm in protecting ourselves
from it.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/101956
Cc: mesa-stable@lists.freedesktop.org
2017-08-31 14:31:42 -07:00
Charmaine Lee
2d93b462b4 vbo: fix offset in minmax cache key
Instead of saving primitive offset in the minmax cache key,
save the actual buffer offset which is used in the cache lookup.

Fixes rendering artifact seen with GoogleEarth when run with
VMware driver.

v2: Per Brian's comment, initialize offset to avoid compiler warning.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-08-30 23:12:21 -07:00
Tapani Pälli
15b61dec94 anv: fix build errors on android
error: incompatible pointer to integer conversion initializing 'VkFence'
   (aka 'unsigned long long') with an expression of type 'void *' [-Werror,-Wint-conversion]

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-31 18:05:50 +03:00
Christian König
214b565bc2 winsys/amdgpu: set AMDGPU_GEM_CREATE_VM_ALWAYS_VALID if possible v2
When the kernel supports it set the local flag and
stop adding those BOs to the BO list.

Can probably be optimized much more.

v2: rename new flag to AMDGPU_GEM_CREATE_VM_ALWAYS_VALID

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-31 14:55:38 +02:00
Marek Olšák
8b3a257851 radeonsi: set a per-buffer flag that disables inter-process sharing (v4)
For lower overhead in the CS ioctl.
Winsys allocators are not used with interprocess-sharable resources.

v2: It shouldn't crash anymore, but the kernel will reject the new flag.
v3 (christian): Rename the flag, avoid sending those buffers in the BO list.
v4 (christian): Remove setting the kernel flag for now

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-31 14:55:21 +02:00
Kenneth Graunke
5ae2de81c8 i965: Use BLORP for buffer object stall avoidance blits instead of BLT.
Improves performance of GFXBench4 tests at 1024x768 on a Kabylake GT2:
- Manhattan 3.1 by 1.32134% +/- 0.322734% (n=8).
- Car Chase by 1.25607% +/- 0.291262% (n=5).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-30 16:59:24 -07:00
Kenneth Graunke
3efedf98e8 i965: Always flush caches after blitting to a GL buffer object.
When we blit data into a buffer object, we may need to invalidate any
caches that might contain stale data, so the new data becomes visible.

For example, if the buffer object is bound as a vertex buffer, we need
to invalidate the vertex fetch cache.

While this flushing was missing, it usually happened implicitly for
non-obvious reasons: we're usually on the render ring, and calling
intel_emit_linear_blit() would require switching to the BLT ring,
causing an implicit flush.  This likely provoked the kernel to do
PIPE_CONTROLs on our behalf.  Although, Gen4-5 wouldn't have this
behavior.  At any rate, we should do it ourselves.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-30 16:59:23 -07:00
Kenneth Graunke
df8f4bfc02 i965: Add PIPE_CONTROL_DATA_CACHE flush to brw_emit_mi_flush().
Although we're phasing out brw_emit_mi_flush(), we still use it in some
places in order to "flush everything".  In a number of those places, we
write data to a buffer that we may then bind as an image surface, SSBO,
or atomic buffer.  Those usages require us to flush the data cache.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-30 16:59:22 -07:00
Kenneth Graunke
225425111f i965: Add a brw_blorp_copy_buffers() command.
This exposes the new blorp_copy_buffer() functionality to i965.
It should be a drop-in replacement for intel_emit_linear_blit()
(other than the arguments being backwards, for consistency with BLORP).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-30 16:59:21 -07:00
Kenneth Graunke
fc20df830c blorp: Make blorp_buffer_copy work on Gen4-6.
Gen4-6 can only handle surfaces up to 8192.  Only Gen7+ can do 16384.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-30 16:59:19 -07:00
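A minimal sketch of the surface-size limit mentioned in fc20df830c, assuming a buffer copy is split into square rectangles no larger than the per-generation maximum. The helper names and the rectangle-splitting scheme are hypothetical and are not taken from BLORP itself; only the 8192 and 16384 limits come from the commit message.

```c
#include <stdint.h>

/* Largest surface dimension per generation, as stated in the commit
 * message: 8192 on Gen4-6, 16384 on Gen7+. */
static uint32_t sketch_max_surface_dim(int gen)
{
   return gen >= 7 ? 16384u : 8192u;
}

/* Hypothetical helper: how many full dim x dim rectangles of
 * block_size-byte texels fit into a copy of `size` bytes.  Whatever is
 * left over would need a smaller final copy. */
static uint64_t sketch_full_rects(int gen, uint64_t size, uint32_t block_size)
{
   const uint64_t dim = sketch_max_surface_dim(gen);
   return size / (dim * dim * block_size);
}
```

With the smaller limit, the same copy needs more rectangles, which is why code written for Gen7+ cannot assume 16384 on older hardware.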
Kenneth Graunke
81d5b61a19 blorp: Turn anv_CmdCopyBuffer into a blorp_buffer_copy() helper.
I want to be able to copy between buffer objects using BLORP in the i965
driver.  Anvil already had code to do this, in a reasonably efficient
manner - first using large bpp copies, then smaller bpp copies.

This patch moves that logic into BLORP as blorp_buffer_copy(), so we
can use it in both drivers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-30 16:59:07 -07:00
Grazvydas Ignotas
b8dd69e1b4 radv: don't assert on empty hash table
Currently if table_size is 0, it's falling through to:

unreachable("hash table should never be full");

But table_size can be 0 when RADV_DEBUG=nocache is set, or when the
table allocation fails (which is not considered an error).

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-08-31 02:47:26 +03:00
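The fix in b8dd69e1b4 can be sketched as an early return in an open-addressing lookup. This is an illustration only: the struct and function names are hypothetical, and the real radv cache keys on a SHA-1 rather than a 32-bit integer.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_entry {
   uint32_t key;
   int value;
};

struct sketch_table {
   struct sketch_entry **slots;
   uint32_t size; /* power of two, or 0 when the table was never allocated */
};

static struct sketch_entry *
sketch_search(const struct sketch_table *t, uint32_t key)
{
   /* An empty table (cache disabled or allocation failed) holds nothing.
    * Without this check the probe loop below runs zero times and falls
    * through to the "never full" path. */
   if (t->size == 0)
      return NULL;

   const uint32_t mask = t->size - 1;
   for (uint32_t i = 0; i < t->size; i++) {
      struct sketch_entry *e = t->slots[(key + i) & mask];
      if (e == NULL)
         return NULL; /* hit an empty slot: key is absent */
      if (e->key == key)
         return e;
   }

   /* A driver would treat this as unreachable because the table is
    * never allowed to fill up; the sketch just reports a miss. */
   return NULL;
}
```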
Brian Paul
5610911fed svga: include sample count in surface_size() computation
Use MAX2() because sampleCount will be zero for non-MSAA surfaces.
No Piglit regressions.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-08-30 13:59:14 -06:00
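A minimal sketch of the MAX2() idea from 5610911fed, assuming the surface size is a plain product of dimensions, bytes per pixel, and sample count. The function name and the formula are hypothetical simplifications; SKETCH_MAX2 stands in for Mesa's MAX2() macro.

```c
#include <stdint.h>

/* Local stand-in for Mesa's MAX2() macro. */
#define SKETCH_MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical size estimate.  A sample count of 0 means "not
 * multisampled", so it must count as 1 rather than zero out the size. */
static uint64_t sketch_surface_size(uint32_t width, uint32_t height,
                                    uint32_t bytes_per_pixel,
                                    uint32_t sample_count)
{
   return (uint64_t)width * height * bytes_per_pixel *
          SKETCH_MAX2(1u, sample_count);
}
```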
Lionel Landwerlin
350ead0f26 i965: drop unused brw->needs_unlit_centroid_workaround
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:18 +01:00
Lionel Landwerlin
b1c9ed25a5 i965: drop brw->has_surface_tile_offset in favor of devinfo's
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:18 +01:00
Lionel Landwerlin
aff1ad0798 i965: drop unused brw->no_simd8
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:18 +01:00
Lionel Landwerlin
6da7a00a84 i965: drop unused brw->has_pln
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:18 +01:00
Lionel Landwerlin
cbee3b03c9 i965: drop brw->must_use_separate_stencil in favor of devinfo's
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:18 +01:00
Lionel Landwerlin
40d20699b7 i965: drop unused brw->has_negative_rhw_bug
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:18 +01:00
Lionel Landwerlin
71493b320d i965: drop unused brw->has_compr4
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:18 +01:00
Lionel Landwerlin
a5f0821485 i965: drop brw->has_llc in favor of devinfo->has_llc
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:18 +01:00
Lionel Landwerlin
27e273578f i965: drop brw->is_broxton
We need to take some care here, as brw->is_broxton has been used to
check whether the device is a low power gen9 (aka Atom gen9 platform).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:18 +01:00
Lionel Landwerlin
b6e783300c i965: drop brw->is_cherryview in favor of devinfo->is_cherryview
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:17 +01:00
Lionel Landwerlin
97e90113c6 i965: drop brw->is_haswell in favor of devinfo->is_haswell
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:17 +01:00
Lionel Landwerlin
d324197de9 i965: drop brw->is_baytrail in favor of devinfo->is_baytrail
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:17 +01:00
Lionel Landwerlin
990c24ad85 i965: drop brw->is_g4x in favor of devinfo->is_g4x
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:17 +01:00
Lionel Landwerlin
46213f676e i965: drop brw->gt in favor of devinfo->gt
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:17 +01:00
Lionel Landwerlin
b83a97a65d i965: drop brw->gen in favor of devinfo->gen
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:17 +01:00
Lionel Landwerlin
de9649071a anv: use device->info instead of brw->is_*
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 17:59:17 +01:00
Mark Janes
8c9df0daf2 Revert "egl: Allow creation of per surface out fence"
This reverts commit 13c23b19d0.

Mesa CI was brought down by this commit, with:

mesa/drivers/dri/i965/brw_sync.c:491: brw_dri_create_fence_fd:
Assertion `brw->screen->has_exec_fence' failed.
2017-08-30 08:45:36 -07:00
Kevin Rogovin
783f2b70c0 i965: add 2xMSAA 16xMSAA modes to DRI configs.
For Gen8, add 2xMSAA. For Gen9, add 2xMSAA and 16xMSAA.
Special thanks to Eero Tamminen for reporting rasterizer
numbers being twice what they should be for 2xMSAA under
a benchmark.

V2: Make pointer name less ugly + add 2xMSAA for Gen8

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-30 08:21:38 -07:00
Kenneth Graunke
d808f44ae5 Revert "i965: add 2xMSAA and 16xMSAA to DRI configs for Gen9."
This reverts commit f6d38785e8.

Kevin's original patch accidentally didn't add 2x for Gen8; he sent
a v2 with a bunch of style fixes shortly after I pushed the original
patch, not knowing it was coming.  Let's just revert this one, apply
v2, and move on.
2017-08-30 08:21:38 -07:00
Eric Engestrom
ac0d8dc3fa mesa/st: remove unwanted backup file
Fixes: 0ac78dc925 "util: move string_to_uint_map to glsl"
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 14:52:16 +01:00
Michael Olbrich
81d5c31631 egl/dri2: only destroy created objects
dri2_display_destroy may be called by dri2_initialize_wayland_drm() if
initialization fails. In this case, these objects may not be initialized.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Michael Olbrich <m.olbrich@pengutronix.de>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-30 14:06:49 +01:00
Zhongmin Wu
13c23b19d0 egl: Allow creation of per surface out fence
Add plumbing to allow creation of per display surface out fence.

Currently enabled only on android, since the system expects a valid
fd in ANativeWindow::{queue,cancel}Buffer. We pass a fd of -1 with
which native applications such as flatland fail. The patch enables
explicit sync on android and fixes one of the functional issues for
apps or buffer consumers which depend upon fence and its timestamp.

v2: a) Also implement the fence in cancelBuffer.
    b) The last sync fence is stored in drawable object
       rather than brw context.
    c) format clear.

v3: a) Save the last fence fd in DRI Context object.
    b) Return the last fence if the batch buffer is empty and
       nothing to be flushed when _intel_batchbuffer_flush_fence
    c) Add the new interface in vtbl to set the retrieve fence

v3.1 a) close fd in the new vtbl interface on non-Android platforms

v4: a) The last fence is saved in brw context.
    b) The retrieve fd is for all the platform but not just Android
    c) Add a uniform dri2 interface to initialize the surface.

v4.1: a) make some changes of variable name.
      b) the patch is broken into two patches.

v4.2: a) Add a deinit interface for surface to clear the out fence

v5: a) Add enable_out_fence to init, platform sets it true or
       false
    b) Change get fd to update fd and check for fence
    c) Commit description updated

v6: a) Heading and commit description updated
    b) enable_out_fence is set only if fence is supported
    c) Review comments on function names
    d) Test with standalone patch, resolves the bug

v6.1: Check for old display fence reverted

v6.2: enable_out_fence initialized to false by default,
      dri2_surf_update_fence_fd updated, deinit changed to fini

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101655

Signed-off-by: Zhongmin Wu <zhongmin.wu@intel.com>
Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2017-08-30 11:55:39 +01:00
Samuel Pitoiset
0d9117b7bd winsys/amdgpu: add BO to the global list only when RADEON_ALL_BOS is set
Only useful when that debug option is enabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-30 09:33:59 +02:00
Samuel Pitoiset
59101e771d radeonsi: update dirty_level_mask before dispatching
This fixes a rendering issue with Hitman when bindless textures
are enabled.

Fixes: 2263610827 ("radeonsi: flush DB caches only when transitioning from DB to texturing")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-30 09:33:55 +02:00
Juan A. Suarez Romero
a2234614b6 anv: set right datatypes in anv_pipeline_binding
This structure contains two fields, binding and index, that store the
binding in the descriptor set and the index inside the binding.

These fields are defined as uint8_t, but the types in the Vulkan
specification are uint32_t, so large values are truncated.

This fixes dEQP-VK.binding_model.shader_access.*.multiple_arbitrary_descriptors.*

v2: use UINT32_MAX for index when having no render targets (Tapani)

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-08-30 08:01:53 +02:00
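A small illustration of the problem fixed in a2234614b6: storing a 32-bit binding number in a uint8_t field keeps only the low 8 bits. The struct and function names below are hypothetical stand-ins, not the actual anv_pipeline_binding definition.

```c
#include <stdint.h>

/* Hypothetical before/after layouts for the two fields named in the
 * commit message. */
struct sketch_binding_narrow {
   uint8_t binding;
   uint8_t index;
};

struct sketch_binding_wide {
   uint32_t binding;
   uint32_t index;
};

static struct sketch_binding_narrow
sketch_store_narrow(uint32_t binding, uint32_t index)
{
   /* The conversion to uint8_t is modulo 256, so 300 becomes 44. */
   struct sketch_binding_narrow b = { (uint8_t)binding, (uint8_t)index };
   return b;
}

static struct sketch_binding_wide
sketch_store_wide(uint32_t binding, uint32_t index)
{
   /* Full range preserved, including UINT32_MAX as a "none" marker. */
   struct sketch_binding_wide b = { binding, index };
   return b;
}
```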
Brian Paul
88cdf16871 llvmpipe: initialize llvmpipe->dirty with LP_NEW_SCISSOR
If llvmpipe_set_scissor_states() is never called, we still need to be sure
that derived scissor/clip state is updated.  As of commit 743ad599a9
that function might not be called.

Fixes regressed Piglit gl-1.0-scissor-offscreen -fbo -auto test.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101709
Fixes: 743ad599a9 ("st/mesa: don't set 16 scissors and 16 viewports
if they're unused")
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
2017-08-29 20:49:36 -06:00
Kenneth Graunke
df85a0f54e i965: Bump the initial program cache size from 4kB to 16kB.
Our initial size of 4kB is way too small to do anything useful, so we
end up growing it at least a few times.  We may as well start it larger.

Some data points:

- Dinoshade (from Mesa Demos): hit 8kB.
- Chromium 60: hit 16kB after browsing a few things in Google Docs.
- GFXBench4 TRex/Manhattan 3.1: hit 128kB
- Unigine Valley 1.0: hit 512kB

It might make sense to start it even larger.

Acked-by: Matt Turner <mattst88@gmail.com>
2017-08-29 16:45:16 -07:00
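To put the data points in df85a0f54e in perspective, here is a rough sketch that counts how many times a cache must grow to reach a given size. It assumes the cache doubles on each growth, which the commit message does not state; the function name is hypothetical.

```c
#include <stddef.h>

/* Number of doublings needed to get from `initial` bytes to at least
 * `needed` bytes.  `initial` must be non-zero.  Doubling is an
 * assumption made for this sketch. */
static unsigned sketch_growth_steps(size_t initial, size_t needed)
{
   unsigned steps = 0;
   size_t size = initial;

   while (size < needed) {
      size *= 2;
      steps++;
   }
   return steps;
}
```

Under that assumption, reaching the 512kB seen in Unigine Valley takes 7 growths from 4kB and 5 from 16kB.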
Kenneth Graunke
9a09e4684d i965: Issue performance warnings when growing the program cache
This involves a bunch of unnecessary copying, a batch flush, and
state re-emission.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-08-29 16:45:07 -07:00
Kevin Rogovin
f6d38785e8 i965: add 2xMSAA and 16xMSAA to DRI configs for Gen9.
Special thanks to Eero Tamminen for reporting rasterizer
numbers being twice what they should be for 2xMSAA under
a benchmark.

Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-29 16:35:15 -07:00
Matt Turner
8b5b6a8abf glsl: define YY_NO_INPUT to prevent unused symbol warnings
Otherwise clang warns:

glsl/glsl_lexer.cpp:3507:16: warning: function 'yyinput' is not needed
and will not be emitted [-Wunneeded-internal-declaration]
    static int yyinput (yyscan_t yyscanner)
               ^

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Matt Turner
37f664a066 blorp: Explicitly cast between different enums
Fixes warnings like

warning: implicit conversion from enumeration type 'enum isl_format' to
different enumeration type 'enum GEN10_SURFACE_FORMAT'
[-Wenum-conversion]
         .SourceElementFormat = ISL_FORMAT_R32_UINT,
                                ^~~~~~~~~~~~~~~~~~~

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
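The kind of change made in 37f664a066 can be sketched as follows. Both enums and the numeric value are placeholders invented for this example; the sketch assumes, as the explicit cast implies, that the two enumerations share numeric values.

```c
/* Placeholder enums standing in for isl_format and a generation-specific
 * hardware format enum.  The value 0x0D7 is illustrative only. */
enum sketch_isl_format {
   SKETCH_ISL_FORMAT_R32_UINT = 0x0D7
};

enum sketch_hw_surface_format {
   SKETCH_HW_R32_UINT = 0x0D7
};

struct sketch_vertex_element {
   enum sketch_hw_surface_format SourceElementFormat;
};

static struct sketch_vertex_element
sketch_make_element(enum sketch_isl_format fmt)
{
   struct sketch_vertex_element ve = {
      /* The explicit cast documents the intent and avoids clang's
       * -Wenum-conversion warning. */
      .SourceElementFormat = (enum sketch_hw_surface_format) fmt,
   };
   return ve;
}
```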
Matt Turner
b962922fb7 intel/isl: Mark functions used conditionally as UNUSED
The functions we're marking as UNUSED in isl_surface_state.c are used
only when compiling for particular generations.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Matt Turner
c4ce12728e intel/isl: Explicitly cast between different enums
Fixes warnings like

warning: implicit conversion from enumeration type 'enum isl_format' to
different enumeration type 'enum GEN10_SURFACE_FORMAT'
[-Wenum-conversion]
         .SourceElementFormat = ISL_FORMAT_R32_UINT,
                                ^~~~~~~~~~~~~~~~~~~

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Matt Turner
9fdbc273ef intel/isl: Remove 'inline' keywords
Unless you have data, the compiler knows better than you whether a
function should be inlined.

Unlike all other cases in this series, the removal of the inline keyword
from isl_format_has_channel_type actually changes the resulting binary
with gcc-6.3.0:

   text	   data	    bss	    dec	    hex	filename
7831116	 346384	 420648	8598148	 833284	i965_dri.so before
7830716	 346384	 420648	8597748	 8330f4	i965_dri.so after

I think this is likely an improvement. No difference in the resulting
binary with clang-4.0.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Matt Turner
cdbaa8a12f anv: Mark functions used conditionally as UNUSED
The functions we're marking as UNUSED in genX_pipeline.c are used only
when compiling for particular generations.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Matt Turner
5d4afef459 anv: Explicitly cast between different enums
Fixes warnings like

warning: implicit conversion from enumeration type 'enum isl_format' to
different enumeration type 'enum GEN10_SURFACE_FORMAT'
[-Wenum-conversion]
         .SourceElementFormat = ISL_FORMAT_R32_UINT,
                                ^~~~~~~~~~~~~~~~~~~

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Matt Turner
6cfc49287d anv: Remove 'inline' keywords
Unless you have data, the compiler knows better than you whether a
function should be inlined.

No difference in the resulting binary with gcc-6.3.0 or clang-4.0.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Matt Turner
012887ef48 anv: Use GNU C empty brace initializer
Avoids Clang's warning about the current code:

   warning: suggest braces around initialization of subobject

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Matt Turner
e99dcfd803 i965: Add $(WNO_OVERRIDE_INIT) to AM_CFLAGS
brw_surface_formats.c and genX_blorp_exec.c do this a lot, causing lots
of warnings from clang.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Matt Turner
477ac09c9e i965: Mark functions used conditionally as UNUSED
The functions we're marking as UNUSED in genX_state_upload.c are used
only when compiling for particular generations.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Matt Turner
239bbdfaa2 i965: Explicitly cast between different enums
Fixes warnings like

warning: implicit conversion from enumeration type 'enum isl_format' to
different enumeration type 'enum GEN10_SURFACE_FORMAT'
[-Wenum-conversion]
         .SourceElementFormat = ISL_FORMAT_R32_UINT,
                                ^~~~~~~~~~~~~~~~~~~

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Matt Turner
dff75c7175 i965: Drop unnecessary conditional
Clang doesn't realize that 0 and 1 are the only possibilities, and thinks
lots of variables might be uninitialized.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Matt Turner
35902f47eb i965: Remove some 'inline' keywords
brw_texture_view_sane() is only used by an assert()...

No difference in the resulting binary with gcc-6.3.0 or clang-4.0.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Matt Turner
50e4099edf nir: Remove series of unnecessary conversions
Clang warns:

warning: absolute value function 'fabsf' given an argument of type
'const float64_t' (aka 'const double') but has parameter of type 'float'
which may cause truncation of value [-Wabsolute-value]

            float64_t dst = bit_size == 64 ? fabs(src0) : fabsf(src0);

The type of the ternary expression will be the common type of fabs() and
fabsf(): double. So fabsf(src0) will be implicitly converted to double.
We may as well just convert src0 to double before a call to fabs() and
remove the needless complexity, à la

            float64_t dst = fabs(src0);

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
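The reasoning in 50e4099edf can be checked with a small sketch. The function names are hypothetical, and the explicit (float) cast is added here only to make the narrowing visible; the original expression relied on an implicit conversion.

```c
#include <math.h>

/* Old shape: the conditional expression has type double regardless of
 * which branch is taken, so the fabsf() result is widened again. */
static double sketch_fold_fabs_old(double src0, unsigned bit_size)
{
   return bit_size == 64 ? fabs(src0) : (double)fabsf((float)src0);
}

/* New shape: one call, no narrowing and no branch. */
static double sketch_fold_fabs_new(double src0)
{
   return fabs(src0);
}
```

For values that are exactly representable as float, both forms return the same result, so the simpler one loses nothing.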
Matt Turner
02ba0d5a7b nir/spirv: Use unreachable("...") rather than assert(!"...")
Quiets a number of uninitialized variable warnings in clang.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
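A sketch of why 02ba0d5a7b quiets uninitialized-variable warnings. SKETCH_UNREACHABLE is a local stand-in for Mesa's unreachable() macro, written with the GCC/Clang builtin __builtin_unreachable(); the enum and function are hypothetical.

```c
#include <assert.h>

/* Asserts in debug builds and tells the compiler the path cannot be
 * taken.  A bare assert(!"...") compiles to nothing under NDEBUG, so
 * the compiler still sees a path where `n` is never assigned. */
#define SKETCH_UNREACHABLE(msg) \
   do { assert(!msg); __builtin_unreachable(); } while (0)

enum sketch_kind { SKETCH_SCALAR, SKETCH_VEC2, SKETCH_VEC4 };

static unsigned sketch_components(enum sketch_kind kind)
{
   unsigned n;

   switch (kind) {
   case SKETCH_SCALAR: n = 1; break;
   case SKETCH_VEC2:   n = 2; break;
   case SKETCH_VEC4:   n = 4; break;
   default:
      SKETCH_UNREACHABLE("invalid kind");
   }

   return n;
}
```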
Matt Turner
f99bde0dad compiler: Add $(WNO_OVERRIDE_INIT) to AM_CFLAGS
nir_intrinsics.h does this a lot, causing lots of warnings from clang.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Matt Turner
c5d2e2d43f configure: Test for -Wno-initializer-overrides
Clang has "-Wno-initializer-overrides", while gcc has
"-Wno-override-init". Quiets a lot of warnings with clang.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Matt Turner
e4b2b69e82 configure: Add and use AX_CHECK_COMPILE_FLAG
This makes it a lot clearer what's happening (at least I think so), and
will make future additions much simpler.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Bas Nieuwenhuizen
083b49ba9d radv: Add trace ids for secondary buffers.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-29 23:06:03 +02:00
Bas Nieuwenhuizen
46dd30d08f ac/debug: Support multiple trace ids for nested IBs.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-29 23:05:59 +02:00
Bas Nieuwenhuizen
43eb761cad radv/amdgpu: Enable dumping of all IBs with RADV_DEBUG=allbos.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-29 23:05:55 +02:00
Emil Velikov
9e07005e87 egl/wayland: make sure HAS_$FORMAT is set for wl_dmabuf
Otherwise eglCreateWaylandBufferFromImageWL will fail, since we
have no "supported" format.

Fixes: 02cc359372 ("egl/wayland: Use linux-dmabuf interface for buffers")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-08-29 13:43:04 +01:00
Emil Velikov
293d64e96f egl/wayland: group wl_win specific code together
Make the code a bit easier to follow. There should be no functional
change since none of the bits set are accessible until the
eglCreateWindowSurface call is complete.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-08-29 13:43:04 +01:00
Emil Velikov
10efd6fe7e egl/wayland: remove dri2_surf width/height double init.
The dimensions are already set [to 0 or the value provided by the
attributes list] by the _eglInitSurface() call further up.

The values are updated, as the DRI driver calls the DRI2/IMAGE_LOADER'
get_buffers, shortly before making use of the values.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2017-08-29 13:43:04 +01:00
Emil Velikov
da100fe697 egl/wayland: set correct format with wl_dmabuf as wl_drm is missing
For most/all cases today, we have wl_drm available alongside wl_dmabuf.
Yet in the long run, we want to make sure the latter can operate without
any traces of the former.

Fixes: 02cc359372 ("egl/wayland: Use linux-dmabuf interface for buffers")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-08-29 13:43:04 +01:00
Emil Velikov
6a1b683e74 egl/wayland: update comment to reflect wl_dmabuf presence
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-08-29 13:43:04 +01:00
Emil Velikov
1a8015e753 egl/wayland: polish object teardown in dri2_wl_destroy_surface
The wl_drm wrapper is created before the wl display/surface ones.
Thus make sure we destroy it after them. In reality it should not make
any difference either way.

Fixes: 03dd9a88b0 ("egl/wayland: Use per-surface event queues")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-08-29 13:41:06 +01:00
Emil Velikov
83442112d7 egl/wayland: plug leaks in dri2_wl_create_window_surface() error path
We forgot to teardown the wl display/surface wrappers.

Fixes: 03dd9a88b0 ("egl/wayland: Use per-surface event queues")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-08-29 13:40:49 +01:00
Emil Velikov
2f76dff65f egl: simplify refcounting after screen creation
If the specific initialize was successful, dri2_egl_display() will
return a non NULL pointer. Thus we can drop the check and flatten the
codeflow.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2017-08-29 13:40:46 +01:00
Emil Velikov
0ac78dc925 util: move string_to_uint_map to glsl
The functionality is used by glsl and mesa, with the latter already
depending on the former.

With this in place the src/util/ static library libmesautil.la no longer
has a C++ dependency. Thus objects which use it (like libEGL) don't need
the C++ link.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Fixes: 02cc359372 ("egl/wayland: Use linux-dmabuf interface for buffers")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101851
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Tested-by: James Harvey <lothmordor@gmail.com>
2017-08-29 13:40:44 +01:00
Marek Olšák
79674066b6 st/mesa: fix XPD lowering - don't read dst
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102461

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-08-29 13:18:37 +02:00
Jason Ekstrand
43e8808b82 anv: Add support for the SYNC_FD handle type for fences
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-28 19:33:43 -07:00
Jason Ekstrand
49c59c88eb anv: Implement VK_KHR_external_fence
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-28 19:33:43 -07:00
Jason Ekstrand
5f372d93a9 anv: Use DRM sync objects to back fences whenever possible
In order to implement VK_KHR_external_fence, we need to back our fences
with something that's shareable.  Since the kernel wait interface for
sync objects already supports waiting for multiple fences in one go, it
makes anv_WaitForFences much simpler if we only have one type of fence.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-28 19:33:43 -07:00
Jason Ekstrand
d21c151091 anv/gem: Add support for syncobj wait and reset
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-28 19:33:43 -07:00
Jason Ekstrand
144487ebb8 anv/gem: Add a flags parameter to syncobj_create
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-28 19:33:43 -07:00
Jason Ekstrand
3cd26d981b drm-uapi: Update headers from drm-next
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-28 19:33:35 -07:00
Jason Ekstrand
ae8365a9eb vulkan/util: Add a vk_zalloc helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-28 18:35:33 -07:00
Jason Ekstrand
caa71343c6 anv: Rename anv_fence_state to anv_bo_fence_state
It only applies to legacy BO fences.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-28 18:35:30 -07:00
Jason Ekstrand
92286dc08a anv: Pull the guts of anv_fence into anv_fence_impl
This is just a refactor, similar to what we did for semaphores, in
preparation for handling VK_KHR_external_fence.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-28 18:35:27 -07:00
Jason Ekstrand
738e5e3c1d anv/wsi: Use QueueSubmit to trigger the fence in AcquireNextImage
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-28 18:35:25 -07:00
Jason Ekstrand
f992bb205c anv: Rework fences to work more like BO semaphores
This commit changes fences to work a bit more like BO semaphores.
Instead of the fence being a batch, it's simply a BO that gets added
to the validation list for the last execbuf call in the QueueSubmit
operation.  It's a bit annoying finding the last submit in the execbuf
but this allows us to avoid the dummy execbuf.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-28 18:35:22 -07:00
Jason Ekstrand
2eacfdeec9 anv/queue: Allow temporary import of SYNC_FD semaphores
We didn't allow them before because it didn't look like the spec allowed
it.  It certainly doesn't make much sense.  However, there are CTS tests
that apparently hit this.  What the spec actually says is:

    "Importing a payload using handle types with copy transference
    creates a duplicate copy of the payload at the time of import, but
    makes no further reference to it. Fence signaling, waiting, and
    resetting operations performed on the target of copy imports must
    not affect any other fence or payload."

A SYNC_FD has copy transference but the import may be temporary or
permanent.  If you do a permanent import of something with copy
transference, I guess it's supposed to work and end up resetting the
permanent state.  In any case, there seems to be no real harm in
allowing it, so why not.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-28 18:34:06 -07:00
Kenneth Graunke
a106ae111c i965: Fix whitespace issues in intel_buffer_objects.c.
Convert tabs to spaces and rewrap one long line.
2017-08-28 17:11:02 -07:00
Timothy Arceri
0168d1f449 radeonsi: stop leaking nir
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-29 09:46:29 +10:00
Grazvydas Ignotas
29f46488cc ac/nir: remove misleading condition
location is never set to INTERP_SAMPLE, and Nicolai comments:
"... that part is misleading. location refers to the base location, not
the final location of the sample, and it can never be INTERP_SAMPLE."

Suggested-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
2017-08-29 01:36:57 +03:00
Grazvydas Ignotas
2b4e31bc9b ac/nir: silence maybe-uninitialized warnings
These are likely false positives, but are also annoying because they
show up on every "make install", which causes ac_nir_to_llvm to be
rebuilt here. Initializing those variables to NULL should be harmless
even when unnecessary.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-29 01:16:58 +03:00
Grazvydas Ignotas
7780374833 radv: clear dynamic_shader_stages on create
Valgrind reports it's being used uninitialized.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-08-29 01:11:02 +03:00
Grazvydas Ignotas
15800180f3 amd: add .editorconfig
amd/common/ and amd/vulkan/ are using tabs for indent, which doesn't
match the settings in root .editorconfig, so let's override.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-29 01:08:58 +03:00
Marek Olšák
9c92e82b32 radeonsi: rewrite late alloc VS limit computation
This is still very simple, but it's better than before.

Loosely ported from Vulkan.
2017-08-28 21:45:33 +02:00
Marek Olšák
39205f216e gallium/radeon: set EVENT_WRITE_EOP.INT_SEL = wait for write confirmation
Ported from Vulkan.
Not sure what this is good for.. maybe write confirmation from L2 flushes?

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-28 21:45:33 +02:00
Marek Olšák
61187c1689 gallium/u_threaded: rename IGNORE_VALID_RANGE -> NO_INFER_UNSYNCHRONIZED
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-28 21:45:33 +02:00
Marek Olšák
28c4c55810 gallium/u_threaded: disallow discard_range if map_buffer is unsynchronized
The discard range codepath takes precedence, so if we get both
unsynchronized and discard_range, choose unsynchronized.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-28 21:45:32 +02:00
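The flag resolution described above can be sketched as follows. The flag names and bit values here are hypothetical stand-ins for illustration, not the real gallium transfer flags, and the actual u_threaded code may be structured differently:

```c
#include <assert.h>

/* Hypothetical flag bits, for illustration only. */
#define MAP_UNSYNCHRONIZED (1u << 0)
#define MAP_DISCARD_RANGE  (1u << 1)

/* If both flags are requested, keep UNSYNCHRONIZED and drop DISCARD_RANGE,
 * so the discard-range codepath cannot take precedence. */
static unsigned resolve_map_flags(unsigned usage)
{
    if (usage & MAP_UNSYNCHRONIZED)
        usage &= ~MAP_DISCARD_RANGE;

    /* Postcondition: the two flags are never set together. */
    assert(!((usage & MAP_UNSYNCHRONIZED) && (usage & MAP_DISCARD_RANGE)));
    return usage;
}
```

A discard_range request without the unsynchronized flag passes through unchanged.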
Jason Ekstrand
63e79a8a77 nir: Fix system_value_from_intrinsic for subgroups
A couple of the cases were backwards

Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-08-28 08:57:52 -07:00
Jason Ekstrand
79d8d6b022 nir: Fix some whitespace
Somehow tabs got in there...

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-08-28 08:57:31 -07:00
Marek Olšák
f173efe916 radeonsi: correct maximum wave count per SIMD
v2: don't special-case Tonga and Iceland.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-28 16:33:24 +02:00
Andres Gomez
ff430ec4fd docs: update calendar, add news item and link release notes for 17.1.8
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-28 16:31:13 +03:00
Andres Gomez
a26dccd131 docs: add sha256 checksums for 17.1.8
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 44e008e85e)
2017-08-28 16:29:00 +03:00
Andres Gomez
0444024556 docs: add release notes for 17.1.8
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit e644f9996b)
2017-08-28 16:28:59 +03:00
Ilia Mirkin
ae53bff8b1 st/mesa: fix handling of vertex array double inputs
The is_double_vertex_input needs to be set for arrays of doubles as
well.

Fixes KHR-GL45.enhanced_layouts.varying_array_locations

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2017-08-28 09:07:40 -04:00
Ilia Mirkin
eefeff09a7 glsl: fix counting of vertex shader output slots used by explicit vars
The argument to count_attribute_slots should only be set to true for
vertex inputs, not for all vertex shader varyings.

Fixes KHR-GL45.enhanced_layouts.varying_locations

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2017-08-28 09:07:40 -04:00
Topi Pohjolainen
5dd072380a intel/compiler: Cast reg types explicitly
Makes coverity happier.

CID: 1416799
Fixes: c1ac1a3d25 (i965: Add a brw_hw_type_to_reg_type() function)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-08-28 14:43:39 +03:00
Gwan-gyeong Mun
c261bc11e6 gallium/docs: Fix an inequality sign of TGSI_SEMANTIC_SUBGROUP_LT_MASK
The previous expression was the same as the one for TGSI_SEMANTIC_SUBGROUP_GT_MASK.
Fix the direction of the inequality for TGSI_SEMANTIC_SUBGROUP_LT_MASK.

before:
  bit index > TGSI_SEMANTIC_SUBGROUP_INVOCATION

after:
  bit index < TGSI_SEMANTIC_SUBGROUP_INVOCATION

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-28 12:05:44 +02:00
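As a rough illustration of the corrected inequality, the two masks can be computed like this. The function names are hypothetical and a 64-wide subgroup is assumed; this is a sketch, not code from the gallium docs or any driver:

```c
#include <assert.h>
#include <stdint.h>

/* Bits whose index is < invocation (the LT mask). */
static uint64_t subgroup_lt_mask(unsigned invocation)
{
    assert(invocation < 64);
    return (UINT64_C(1) << invocation) - 1;
}

/* Bits whose index is > invocation (the GT mask). The shift is done in
 * two steps so that invocation == 63 never shifts by a full 64 bits. */
static uint64_t subgroup_gt_mask(unsigned invocation)
{
    assert(invocation < 64);
    return (~UINT64_C(0) << invocation) << 1;
}
```

Neither mask contains the invocation's own bit, so the two never overlap.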
Samuel Pitoiset
5ba443b246 radv: propagate VK_ERROR_OUT_OF_HOST_MEMORY to vk{Begin,End}CommandBuffer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-08-28 11:25:47 +02:00
Samuel Pitoiset
2bc3d65690 radv: rename record_fail to record_result and use VkResult
This will allow propagating VK_ERROR_OUT_OF_HOST_MEMORY to
vkEndCommandBuffer() when necessary.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-08-28 11:25:44 +02:00
Gwan-gyeong Mun
db91b8536e gallium/docs: fix a typo
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-28 10:33:42 +02:00
Eduardo Lima Mitev
1d8111ebac i915g: Remove a few unused variables
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-28 08:59:50 +02:00
Timothy Arceri
2422124f6e disk_cache: assert if a cache entry's keys don't match mesa
In ef42423e7b I enabled the check for release builds; however, we
still want to assert in debug builds in case of collisions or
just general bugs with the key building/compare code. Otherwise
it will just fail silently, effectively disabling the cache.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-08-28 09:29:15 +10:00
Marek Olšák
d500c9b060 Revert "radeonsi: get the raster config from AMDGPU on SI"
This reverts commit fc99cb3c9e.

"The performance went down from 64.7 to 51.4 fps in Valley and from 30.8 to
25.1 fps in Heaven on Radeon HD 7970. Other games seem to have also a 10-25%
performance decrease."

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102429

It looks like we can't use the raster config values from the kernel.
2017-08-27 22:27:23 +02:00
Dave Airlie
9573bd70e1 radv/wsi: Compute correct row_pitch for GFX9.
(commit split out by Bas Nieuwenhuizen)

Fixes: 65477bae9c "radv: enable GFX9 on radv"
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-08-27 01:19:27 +02:00
Christian Gmeiner
67fc3e37a7 etnaviv: use correct param for etna_compatible_rs_format(..)
Found by code inspection.

Fixes: c9e8b49b88 ("etnaviv: gallium driver for Vivante GPUs")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-26 17:20:39 +02:00
Emil Velikov
f0d053cb6d egl: don't NULL deref the .get_capabilities function pointer
One could easily introduce version 3 of the DRI2fenceExtension,
extending the struct, while not implementing the above function.

Thus we'll end up with a NULL pointer, and dereferencing it won't fare
too well.

Fixes: 0201f01dc4 ("egl: add EGL_ANDROID_native_fence_sync")
Cc: Rob Clark <robclark@freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-08-26 11:22:17 +01:00
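A minimal sketch of the guard this describes, using a hypothetical extension struct rather than the real DRI2 fence extension layout: a version check alone is not enough, so the function pointer is tested as well before it is called.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a versioned extension vtable. */
struct fence_ext {
    int version;
    unsigned (*get_capabilities)(void *screen);
};

/* Example implementation used only to exercise the sketch. */
static unsigned fake_caps(void *screen)
{
    (void)screen;
    return 1u;
}

/* Returns the capability bits, or 0 when the hook is unavailable.
 * A newer extension version may still leave the pointer NULL. */
static unsigned query_fence_caps(const struct fence_ext *ext, void *screen)
{
    assert(ext != NULL);
    if (ext->version >= 2 && ext->get_capabilities)
        return ext->get_capabilities(screen);
    return 0;
}
```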
Emil Velikov
10524d105d mapi/gen: remove shebang from the marshal generator scripts
The scripts are invoked with the correct version of python and are
missing the execute bit.

Follow the rest of Mesa and drop the shebang line.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-26 11:20:08 +01:00
Emil Velikov
e396265368 dri_interface.h: add missing stdint.h include
Required for uint32_t and friends.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-26 11:20:07 +01:00
Emil Velikov
98030f92e8 xmlconfig: use the portable __VA_ARGS__
Follow the example used throughout Mesa and use "..." + "__VA_ARGS__".
The former tends to be more common and portable.

v2: use ##__VA_ARGS__ (Eric)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-26 11:20:06 +01:00
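For readers unfamiliar with the v2 note, here is a small sketch of the "..." + "##__VA_ARGS__" form. The macro and function names are made up for illustration and are not the ones in xmlconfig. Note that `##__VA_ARGS__` is itself a GNU extension (accepted by GCC and Clang); it drops the trailing comma when no variadic arguments are given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical logging macro: works with and without extra arguments
 * because ## removes the comma before an empty __VA_ARGS__. */
#define CFG_FORMAT(buf, size, fmt, ...) \
    snprintf(buf, size, "cfg: " fmt, ##__VA_ARGS__)

/* Formats one of two messages into buf and returns snprintf's result. */
static int format_demo(char *buf, size_t size, int with_arg)
{
    assert(buf != NULL && size > 0);
    if (with_arg)
        return CFG_FORMAT(buf, size, "value %d", 42);
    return CFG_FORMAT(buf, size, "no arguments");
}
```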
Brian Paul
d819b1fcec gallium/vbuf: fix buffer reference bugs
In two places we called pipe_resource_reference() to remove a reference
to a vertex buffer resource.  But we neglected to check if the buffer was
a user buffer and not a pipe_resource.  This caused us to pass an invalid
pipe_resource pointer to pipe_resource_reference().

Instead of calling pipe_resource_reference(&vbuf->resource, NULL), use
pipe_vertex_buffer_unreference(&vbuf) which checks the is_user_buffer
field and does the right thing.

Also, explicitly set the is_user_buffer field to false after setting the
vbuf->resource pointer to out_buffer.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102377
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-25 20:26:52 -06:00
Andres Gomez
42d62e61bc docs: add an additional final cycle for 17.1
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-08-26 00:49:50 +03:00
Andres Gomez
8e07ad1e31 docs: remove released and extend the calendar until the end of 2017
Completed the 17.2 cycle and added the beginning of the 17.3 one.

v2: Add 17.2-rc6 as tentative final version to be promoted to 17.2.0
    final (Eric).

Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-08-26 00:47:40 +03:00
Bas Nieuwenhuizen
9b7e663da1 radv: Fix sparse BO mapping merging.
If we merge a mapping with the mapping before it, we need to change
not only the offset but also the bo offset.

Fixes: 715df30a4e "radv/amdgpu: Add winsys implementation of virtual buffers."
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-25 22:47:49 +02:00
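The merge rule can be sketched with a hypothetical range struct; the field names and the integer `bo_id` are assumptions for illustration, not the radv winsys types. When the current mapping absorbs the one before it, its start moves backwards, so both the virtual offset and the offset into the backing BO must be taken from the previous mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mapping of a virtual range onto part of a buffer object. */
struct va_range {
    uint64_t offset;    /* start within the virtual buffer */
    uint64_t size;
    uint64_t bo_offset; /* start within the backing BO */
    int bo_id;          /* stand-in for a BO pointer */
};

/* True when cur directly continues prev, both virtually and in the BO. */
static bool ranges_mergeable(const struct va_range *prev,
                             const struct va_range *cur)
{
    return prev->bo_id == cur->bo_id &&
           prev->offset + prev->size == cur->offset &&
           prev->bo_offset + prev->size == cur->bo_offset;
}

/* Grow cur backwards to cover prev: update offset AND bo_offset. */
static void merge_with_previous(const struct va_range *prev,
                                struct va_range *cur)
{
    assert(ranges_mergeable(prev, cur));
    cur->offset = prev->offset;
    cur->bo_offset = prev->bo_offset;
    cur->size += prev->size;
}
```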
Bas Nieuwenhuizen
fba0e07869 radv: Fix off by one in MAX_VBS assert.
e.g. 0 + 32 <= 32 should be valid.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-25 22:47:49 +02:00
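A small sketch of the bound in question. The value 32 for MAX_VBS is taken from the "0 + 32 <= 32" example in the message, and both helper names are hypothetical rather than radv functions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_VBS 32 /* assumed from the example in the commit message */

/* True when bindings [first, first + count) fit in MAX_VBS slots.
 * Using "<" here would wrongly reject a range ending exactly at MAX_VBS. */
static bool vb_range_is_valid(uint32_t first, uint32_t count)
{
    return first + count <= MAX_VBS;
}

/* Hypothetical caller that asserts on the range, as the driver does. */
static void bind_vertex_buffers(uint32_t first, uint32_t count)
{
    assert(vb_range_is_valid(first, count));
    /* ... binding work would go here ... */
}
```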
Bas Nieuwenhuizen
bd81cb3206 radv: Don't set a new subpass on compute resolve.
We don't use the render path, so this is totally unneeded.

Fixes: 19be95f71e "radv: add subpass resolve compute path"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-25 22:47:49 +02:00
Bas Nieuwenhuizen
e5c4e10769 radv: Remove some intel comments from the resolve code.
These are clearly not applicable to radv.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-25 22:47:49 +02:00
Adam Jackson
cd8ab40cd4 egl/drm: Don't "fall back" to /dev/dri/card0 if the first open fails
The snprintf stuff here already constructs the right name for the device
node, and if it doesn't, you configured Mesa wrong, don't do that.

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-25 16:21:43 -04:00
Kenneth Graunke
e8378adc01 i965: Use GEN_GEN and GEN_IS_HASWELL in genX_state_upload.c code.
We were using brw->gen, brw->is_haswell, and devinfo->gen in a few
places, when we could just use GEN_GEN and GEN_IS_HASWELL, which are
evaluated at compile time.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-08-25 11:40:43 -07:00
Rafael Antognolli
1eb58960bf i965: Do not store SRC after 0 on component control.
The PRM SKL-Vol 2b-05.16 says:

   "Within a VERTEX_ELEMENT_STATE structure, if a Component Control
   field is set to something other than VFCOMP_STORE_SRC, no
   higher-numbered Component Control fields may be set to
   VFCOMP_STORE_SRC. In other words, only trailing components can be set
   to something other than VFCOMP_STORE_SRC."

Since we set the component 1 to VFCOMP_STORE_0 on gen8+, and
VFCOMP_STORE_IID on gen5+, and we are not using components 2 and 3,
let's also set them to VFCOMP_STORE_0.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-08-25 10:02:09 -07:00
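The PRM rule quoted above amounts to a simple ordering check over the four component controls. The enum below uses hypothetical names and values, not the real hardware encodings, and the checker is only a sketch of the constraint:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical component-control values, for illustration only. */
enum comp_ctrl {
    COMP_STORE_SRC,
    COMP_STORE_0,
    COMP_STORE_IID
};

/* Once a component is something other than STORE_SRC, no
 * higher-numbered component may be STORE_SRC. */
static bool comp_ctrl_order_ok(const enum comp_ctrl c[4])
{
    bool seen_non_src = false;

    assert(c != NULL);
    for (int i = 0; i < 4; i++) {
        if (c[i] != COMP_STORE_SRC)
            seen_non_src = true;
        else if (seen_non_src)
            return false;
    }
    return true;
}
```

Under this check, leaving components 2 and 3 as STORE_SRC after a STORE_0 in component 1 is rejected, while setting them to STORE_0 passes.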
Adam Jackson
2bae451bd3 mesa: Implement GL_ARB_polygon_offset_clamp
Semantically identical to the EXT version (whose string is still valid
for GLES), so rename the bit but expose both extension strings.
(Suggested by Ilia Mirkin and Ian Romanick.)

v3: Fix the entrypoint alias in GL4x.xml (Ilia)

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-08-25 12:38:14 -04:00
Adam Jackson
00caf2ab08 mesa: Implement GL_ARB_texture_filter_anisotropic
The only difference from the EXT version is bumping the minmax to 16, so
just hit all the drivers at once.

v2: Fix driver names, add to 17.3 release notes (Ilia Mirkin)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-25 12:38:01 -04:00
Marek Olšák
ddc9b4e823 gallium/u_threaded: fix a typo
2017-08-25 15:40:28 +02:00
Eric Engestrom
79ee1b2ff0 khronos/egl: remove dependency on Android NDK header
Khronos: https://github.com/KhronosGroup/EGL-Registry/pull/22
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-25 14:19:53 +01:00
Eric Engestrom
88eb19cf79 Revert "egl/android: add missing include"
This reverts commit 688d866eca.

The include I added in 688d866eca isn't actually useful, as it only
declares the opaque struct ANativeWindow.
However, this caused build issues for android-x86 [1] due to the header
being moved in Android O.

[1] https://lists.freedesktop.org/archives/mesa-dev/2017-August/167626.html

Fixes: 688d866eca "egl/android: add missing include"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-25 14:03:48 +01:00
Samuel Pitoiset
fff1327547 mesa: add KHR_no_error support to glBindBufferOffsetEXT()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:35:30 +02:00
Samuel Pitoiset
dc058f850c mesa: add bind_buffer_offset() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:35:30 +02:00
Samuel Pitoiset
83690d4590 mesa: add KHR_no_error support to glTransformFeedbackVaryings()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:35:30 +02:00
Samuel Pitoiset
a5319d9fde mesa: add transform_feedback_varyings() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:35:30 +02:00
Samuel Pitoiset
4b5140d20b mesa: add KHR_no_error support to glResumeTransformFeedback()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:35:30 +02:00
Samuel Pitoiset
f0476e0020 mesa: add resume_transform_feedback() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:35:30 +02:00
Samuel Pitoiset
1c88ed9558 mesa: add KHR_no_error support to glPauseTransformFeedback()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:35:29 +02:00
Samuel Pitoiset
061a1eebe1 mesa: add pause_transform_feedback() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:35:29 +02:00
Samuel Pitoiset
08cecec3c0 mesa: add KHR_no_error support to glEndTransformFeedback()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:35:29 +02:00
Samuel Pitoiset
654587696b mesa: add end_transform_feedback() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:35:29 +02:00
Samuel Pitoiset
3906e8ab64 mesa: add KHR_no_error support to glBeginTransformFeedback()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:35:29 +02:00
Samuel Pitoiset
088d5cb44f mesa: add begin_transform_feedback() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:35:29 +02:00
Samuel Pitoiset
b0590ace75 mesa: add KHR_no_error support to glBindTransformFeedback()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:24:17 +02:00
Samuel Pitoiset
efb9811680 mesa: add bind_transform_feedback() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:24:08 +02:00
Samuel Pitoiset
5946806064 mesa: port the LastLookedUpVAO optimisation to _mesa_lookup_vao()
It was only used in the errors path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:11:54 +02:00
Samuel Pitoiset
08ee28b6a8 mesa: don't error check the default buffer object in glBindBufferOffsetEXT()
An allocation check is already done when the buffer is created at
context creation.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:11:51 +02:00
Samuel Pitoiset
c7b201a50d mesa: add _fallback suffix to the default transform feedback functions
In preparation for KHR_no_error support.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:11:49 +02:00
Samuel Pitoiset
4f532ab30e mesa: remove unnecessary check in _mesa_init_transform_feedback_object()
All callers already check that, and the common behaviour is to
check in the _mesa_new_XXX() helpers anyway.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:11:46 +02:00
Samuel Pitoiset
41c7c2d968 mesa: check allocation failures in new_transform_feedback()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:11:44 +02:00
Samuel Pitoiset
dd53bdd5aa mesa: remove unused _mesa_validate_transform_feedback_buffers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 11:11:42 +02:00
Kai Chen
151188d1e3 egl/wayland: Use roundtrips when awaiting buffer release
In get_back_bo, we use wl_display_dispatch_queue() to block and wait for
a buffer release event. However, not all Wayland compositors flush the
client socket on posting a buffer-release event, so by only blocking
client-side, we may block indefinitely, or at least need to wait for an
input event / frame completion to arrive for the compositor to flush.

We now use dispatch_queue as a first pass, but if our entire buffer pool
is exhausted, use a roundtrip (an immediately-triggered wl_callback) to
ensure that the compositor flushes out our release event immediately.

[daniels: Modified comment and commit message.]

Signed-off-by: Kai Chen <kai.chen@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
CC: <mesa-stable@lists.freedesktop.org>
2017-08-25 09:57:03 +01:00
Nicolai Hähnle
4da6cf6c98 glsl: fix glsl_struct_field size calculations for shader cache
Found by address sanitizer:

==22621==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61400000cbd8 at pc 0x7f561610a4ff bp 0x7ffca85f9d50 sp 0x7ffca85f94f8
READ of size 344 at 0x61400000cbd8 thread T0
    #0 0x7f561610a4fe  (/usr/lib/x86_64-linux-gnu/libasan.so.3+0x5f4fe)
    #1 0x7f560bb305a5 in memcpy /usr/include/x86_64-linux-gnu/bits/string3.h:53
    #2 0x7f560bb305a5 in blob_write_bytes ../../../mesa-src/src/compiler/glsl/blob.c:136
    #3 0x7f560be7d7ff in encode_type_to_blob ../../../mesa-src/src/compiler/glsl/shader_cache.cpp:153
    #4 0x7f560be81222 in write_program_resource_data ../../../mesa-src/src/compiler/glsl/shader_cache.cpp:950
    #5 0x7f560be81222 in write_program_resource_list ../../../mesa-src/src/compiler/glsl/shader_cache.cpp:1118
    #6 0x7f560be81222 in shader_cache_write_program_metadata(gl_context*, gl_shader_program*) ../../../mesa-src/src/compiler/glsl/shader_cache.cpp:1407
    #7 0x7f560b825fdb in link_program ../../../mesa-src/src/mesa/main/shaderapi.c:1163

Fixes: 073a84ff60 ("glsl: stop adding pointers from glsl_struct_field to the cache")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-25 09:05:28 +02:00
Ilia Mirkin
f623e1742f a2xx: fix DST_ALPHA blending for non-alpha formats
If we're rendering to a format without alpha, convert DST_ALPHA blend to
a ONE so that factors are properly computed. This same workaround is
done on a3xx+ as well.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-08-25 00:18:34 -04:00
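The workaround can be sketched as a small remapping step. The enum and function names are hypothetical, not the freedreno ones, and only the DST_ALPHA case named in the message is handled:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of blend factors, for illustration only. */
enum blend_factor {
    BLEND_ZERO,
    BLEND_ONE,
    BLEND_SRC_ALPHA,
    BLEND_DST_ALPHA
};

/* For render targets without an alpha channel, treat DST_ALPHA as ONE;
 * all other factors, and all formats with alpha, pass through. */
static enum blend_factor fixup_blend_factor(enum blend_factor f,
                                            bool has_alpha)
{
    enum blend_factor out = f;

    if (!has_alpha && f == BLEND_DST_ALPHA)
        out = BLEND_ONE;

    /* Postcondition: DST_ALPHA never survives for alpha-less formats. */
    assert(has_alpha || out != BLEND_DST_ALPHA);
    return out;
}
```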
Ilia Mirkin
f3bde890cd a2xx: set constant blend color
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-08-25 00:18:33 -04:00
Timothy Arceri
f40908f2d1 radeonsi: set IF_THRESHOLD to 4
In 74e39de932 it was set to 3 and it was reported that 4 caused
tesseract to start spilling VGPRs. This no longer seems to be the
case.

Totals:
SGPRS: 2787844 -> 2787764 (-0.00 %)
VGPRS: 1713121 -> 1712717 (-0.02 %)
Spilled SGPRs: 7532 -> 7532 (0.00 %)
Spilled VGPRs: 49 -> 33 (-32.65 %)
Private memory VGPRs: 2060 -> 2060 (0.00 %)
Scratch size: 2200 -> 2180 (-0.91 %) dwords per thread
Code Size: 79265520 -> 79248360 (-0.02 %) bytes
LDS: 436 -> 436 (0.00 %) blocks
Max Waves: 670535 -> 670608 (0.01 %)
Wait states: 0 -> 0 (0.00 %)

Before:
 VGPR SPILLING APPS   Shaders SpillVGPR  PrivVGPR ScratchSize
 EffectsCaveDemo          301         0       256       264
 ReflectionsSubwayDemo    264         0       256       264
 VehicleGame              295         0       128       132
 bioshock-infinite       1140         0       448       516
 dirt-showdown            453        33         0        28
 gang-beasts              364         0       500       496
 kerbal-space-program    1228         0       472       480
 tomb-raider-ultra       1199        16         0        20

After:
 VGPR SPILLING APPS   Shaders SpillVGPR  PrivVGPR ScratchSize
 EffectsCaveDemo          301         0       256       264
 ReflectionsSubwayDemo    264         0       256       264
 VehicleGame              295         0       128       132
 bioshock-infinite       1140         0       448       516
 dirt-showdown            453        33         0        28
 gang-beasts              364         0       500       496
 kerbal-space-program    1228         0       472       480

The only change in VGPR spills is the elimination of all spills
in Tomb Raider at Ultra settings. Closer examination shows that
the shaders go over the limit because they contain three
expressions: a mul, an rcp and a ubo load. The ubo load is actually
used elsewhere and is therefore already stored in a temp in IRs
such as TGSI, but GLSL IR counts it against the if cost.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-08-25 14:09:32 +10:00
Timothy Arceri
b86ecea344 util/disk_cache: write cache item metadata to disk
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-25 13:20:29 +10:00
Timothy Arceri
ea2515d780 glsl: pass shader source keys to the disk cache
We don't actually write them to disk here. That will happen in the
following commit.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-25 13:20:29 +10:00
Timothy Arceri
07018d49dc util/disk_cache: add struct cache_item_metadata
This will be used to store more information about the cache item
in its header. This information is intended for 3rd party and
cache analysis use but can also be used for detecting the unlikely
scenario of cache collisions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-25 13:20:29 +10:00
Timothy Arceri
ef42423e7b disk_cache: enable limited hash collision detection in release builds
It really doesn't cost us much and will stop strange crashes should
the stars align.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-25 13:20:29 +10:00
Timothy Arceri
28b326238b util/disk_cache: rename mesa cache dir and introduce cache versioning
Steam is already analysing cache items. Unfortunately, we did not
introduce a versioning mechanism for identifying structural changes
to cache entries earlier, so the only way to do so is to rename the
cache directory.

Since we are renaming it we take the opportunity to give the directory
a more meaningful name.

Adding a version field to the header of cache entries will help us to
avoid having to rename the directory in future. Please note this is
versioning for the internal structure of the entries as defined in
disk_cache.{c,h} as opposed to the structure of the data provided to
the disk cache by the GLSL compiler and the various driver backends.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-25 13:20:29 +10:00
Dave Airlie
4a091b0788 radv: don't crash if we have no framebuffer
Recording secondaries with no framebuffer attachment may
make this happen, though this might not be the complete solution.

(esp if someone does meta stuff in there, would we have to
save things, not sure).

Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-25 00:52:48 +01:00
Dave Airlie
19f6906c1e radv/gfx9: gfx9 has buffer sizing rules like pre-VI.
This fixes:
dEQP-VK.robustness.buffer_access.* on GFX9.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-25 00:52:36 +01:00
Dave Airlie
12fd0f8dc1 radv: fix predication on gfx9
When I added gfx9 I did it wrong, this fixes it.

Fixes: 5247b311e9 "radv/gfx9: fix set predication packet."
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-25 00:52:32 +01:00
Jason Ekstrand
95f533d922 anv,i965: Move CS shared lowering into anv
Right now, OpenGL uses the GLSL lowering for shared variables and anv
uses NIR to lower them.  For a long time, we've done this weird thing
where we do the NIR lowering unconditionally and then add the SLM sizes
from the two together.  This works because one of them will always be 0
but it's a bit sketchy.  Let's just move the NIR-based lowering into
anv_pipeline and get rid of the sketch.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-24 16:34:29 -07:00
Mauro Rossi
725741f10d ac/debug: use util_strchrnul() to fix android build error
Similar to e09d04cd56 "radeonsi: use util_strchrnul() to fix android build error"

Android Bionic does not support the strchrnul() string function;
gallium auxiliary util/u_string.h provides util_strchrnul() instead.

This change avoids the following warning and error:

external/mesa/src/amd/common/ac_debug.c:501:15: warning: implicit declaration of function 'strchrnul' is invalid in C99
                char *end = strchrnul(out, '\n');
                            ^
external/mesa/src/amd/common/ac_debug.c:501:9: error: incompatible integer to pointer conversion initializing 'char *' with an expression of type 'int'
                char *end = strchrnul(out, '\n');
                      ^     ~~~~~~~~~~~~~~~~~~~~
1 warning and 1 error generated.

Fixes: c2c3912410 "ac/debug: annotate IB dumps with the raw values"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-08-24 17:23:24 -05:00
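For context, a portable replacement for strchrnul() can be as small as the sketch below. The name `sketch_strchrnul` is made up, and this is not necessarily how util_strchrnul() in util/u_string.h is written:

```c
#include <assert.h>
#include <stddef.h>

/* Returns a pointer to the first occurrence of c in s, or to the
 * terminating NUL byte if c does not occur (unlike strchr, never NULL). */
static char *sketch_strchrnul(const char *s, char c)
{
    assert(s != NULL);
    while (*s != '\0' && *s != c)
        s++;
    return (char *)s;
}
```

Because the result is never NULL, callers such as a line-splitting loop can compute `end - out` without a separate not-found branch.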
Marek Olšák
fc99cb3c9e radeonsi: get the raster config from AMDGPU on SI
Not sure yet if we wanna do this on CIK and VI too.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-24 23:54:55 +02:00
Marek Olšák
28d5c30179 radeonsi: clean up setting GRBM_GFX_INDEX
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-24 23:54:55 +02:00
Marek Olšák
0b50f0915b radeonsi: move PA_SC_RASTER_CONFIG emission into a separate function
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-24 23:54:55 +02:00
Rob Herring
7db0bf8b31 Android: fix Android O version check for LLVM
With the release of O, the MESA_ANDROID_MAJOR_VERSION has changed to 8.
Change the LLVM check to match. There's no point in continuing to support 'O',
as no one is going to use an old AOSP master.

Presumably, we'll be back here again to fix things again for P (or 9).

Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-08-24 15:04:37 -05:00
Adam Jackson
5d2205fafb include: Sync Khronos headers for OpenGL 4.6
Taken from c21e602b9fda1d3bbaecb08194592f67e6a0649b from
OpenGL-Registry. (This time without breaking glext.h.)

Signed-off-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-08-24 13:47:18 -04:00
Bas Nieuwenhuizen
ba51ad2f25 radv: Expose VK_KHX_multiview.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-24 19:20:47 +02:00
Bas Nieuwenhuizen
e3265c10c8 radv: Implement multiview draws.
v2: - Use for_each_bit.
    - split emitting the draw packets out to separate functions.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-24 19:20:47 +02:00
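The v2 note mentions iterating with for_each_bit. As a rough sketch of that idea, the loop below visits each set bit of a view mask in ascending order; that one draw is issued per set bit is an assumption here, and `collect_views` is a hypothetical helper, not the radv implementation or Mesa's for_each_bit macro:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes the index of every set bit in view_mask to views[] (ascending)
 * and returns how many were found. A driver would presumably emit one
 * draw per index instead of storing it. */
static unsigned collect_views(uint32_t view_mask, unsigned views[32])
{
    unsigned n = 0;

    assert(views != NULL);
    /* m &= m - 1 clears the lowest set bit on each iteration. */
    for (uint32_t m = view_mask; m != 0; m &= m - 1) {
        unsigned i = 0;
        while (((m >> i) & 1u) == 0)
            i++; /* find the index of the lowest set bit */
        views[n++] = i;
    }
    return n;
}
```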
Bas Nieuwenhuizen
db8e99f72d radv: Implement determining the has_multiview_view_index key.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-24 19:20:47 +02:00
Bas Nieuwenhuizen
180c1b924e ac/nir: Add shader support for multiviews.
It uses a user SGPR to pass the view index to the shaders, except
for the fragment shader where we use layer=view (which comes in
handy when we want to do the NV ext that allows us to execute pre-FS
stages once instead of per view).

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-24 19:20:47 +02:00
Bas Nieuwenhuizen
2e86f6b259 radv: Add multiview clears.
v2: Use for_each_bit.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-24 19:20:47 +02:00
Bas Nieuwenhuizen
3907d63259 radv: Store multiview info in renderpass.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-24 19:20:47 +02:00
Bas Nieuwenhuizen
eec5578158 ac/nir: Make shader key a struct.
Some bits can be passed to almost every shader, and I don't like
adding 5 variables.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-24 19:20:47 +02:00
Bas Nieuwenhuizen
64164a1313 radv: Use 0 for the layer id if the vertex shader does not export it.
To use when we have e.g. input attachments, but there is no layer
export in the previous shader and hence no layered rendering.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-24 19:20:47 +02:00
Bas Nieuwenhuizen
3d5f29f5f9 ac/nir: Implement input attachments with layered rendering.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-24 19:20:47 +02:00
Bas Nieuwenhuizen
c848e642d2 ac/nir: Determine if input attachments are used in the info pass.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-24 19:20:47 +02:00
Bas Nieuwenhuizen
43595db302 ac/nir: Cast sources of integer ops to int.
The int32->float semantic conversion got dropped in a testcase,
because the src was already float. On closer inspection I decided
to add a few more casts for integer op operands to be safe too.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-24 19:20:47 +02:00
Adam Jackson
9e45440833 Revert "include: Sync Khronos headers for OpenGL 4.6"
Broke the BUILDING_MESA bit, oops.

This reverts commit ef1e87e6cd.
2017-08-24 13:15:15 -04:00
Adam Jackson
ef1e87e6cd include: Sync Khronos headers for OpenGL 4.6
Taken from c21e602b9fda1d3bbaecb08194592f67e6a0649b from
OpenGL-Registry.

Signed-off-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-08-24 12:48:30 -04:00
Eric Engestrom
39f3e2507c dri: fix typo
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-24 16:14:34 +01:00
Eric Engestrom
55db2b6cfb i965: add missing const in function signature
Gets rid of a few warnings of the form:
  src/mesa/drivers/dri/i965/intel_screen.c:918:49: warning: passing argument 2 of ‘modifier_is_supported’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
          !modifier_is_supported(&screen->devinfo, f, 0, modifier))
                                                   ^
  src/mesa/drivers/dri/i965/intel_screen.c:301:1: note: expected ‘struct intel_image_format *’ but argument is of type ‘const struct intel_image_format *’

Fixes: 1efd73df39 "i965: Advertise the CCS modifier"
Cc: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-08-24 16:11:45 +01:00
Eric Engestrom
688d866eca egl/android: add missing include
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Rob Herring <robh@kernel.org>
2017-08-24 16:02:14 +01:00
Brian Paul
fe2f5cfdc7 vbo: fix glVertexAttrib(index=0)
Depending on which extension or GL spec you read the behavior of
glVertexAttrib(index=0) either sets the current value for generic
attribute 0, or it emits a vertex just like glVertex().  I believe
it should do either, depending on context (see below).

The piglit gl-2.0-vertex-const-attr test declares two vertex attributes:
  attribute vec2 vertex;
  attribute vec4 attr;
and the GLSL linker assigns "vertex" to location 0 and "attr" to location 1.
The test passes.

But if the declarations were reversed such that "attr" was location 0 and
"vertex" was location 1, the test would fail to draw properly.

The problem is the call to glVertexAttrib(index=0) to set attr's value
was interpreted as glVertex() and did not set generic attribute[0]'s value.
Interestingly, calling glVertex() outside glBegin/End (which is effectively
what the piglit test does) does not generate a GL error.

I believe the behavior of glVertexAttrib(index=0) should depend on
whether it's called inside or outside of glBegin/glEnd().  If inside
glBegin/End(), it should act like glVertex().  Else, it should behave
like glVertexAttrib(index > 0).  This seems to be what NVIDIA does.

This patch makes two changes:

1. Check if we're inside glBegin/End for glVertexAttrib()
2. Fix the vertex array binding for recalculate_input_bindings().  As it was,
   we were using &vbo->currval[VBO_ATTRIB_POS], but that's interpreted
   as a zero-stride attribute and doesn't make sense for array drawing.

No Piglit regressions.  Fixes updated gl-2.0-vertex-const-attr test and
passes new gl-2.0-vertex-attrib-0 test.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101941
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-08-24 07:36:10 -06:00
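
The rule proposed in the vbo commit above can be sketched as a small decision function. This is only an illustration of the described behavior, not the vbo code itself; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome labels; these are not Mesa identifiers. */
enum attrib0_action {
   EMIT_VERTEX,         /* behave like glVertex() */
   SET_CURRENT_GENERIC  /* only update the attribute's current value */
};

/* Attribute 0 emits a vertex only inside glBegin/glEnd; in every other
 * case the call just sets the current value of the generic attribute. */
static enum attrib0_action
classify_vertex_attrib(unsigned index, bool inside_begin_end)
{
   if (index == 0 && inside_begin_end)
      return EMIT_VERTEX;
   return SET_CURRENT_GENERIC;
}
```

With this rule, a shader that places "attr" at location 0 still receives the value set through glVertexAttrib(index=0) when the call is made outside glBegin/glEnd.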
Brian Paul
a7e65a443f gallivm: remove unused variable
Trivial.
2017-08-24 07:36:10 -06:00
Brian Paul
84f35c3423 st/mesa: add const qualifiers in st_extensions.c
Trivial.
2017-08-24 07:30:23 -06:00
Brian Paul
6ad313acf5 st/mesa: whitespace/indentation fixes in st_init_extensions()
2017-08-24 07:30:23 -06:00
Brian Paul
f883ede949 pipe-loader: use MAYBE_UNUSED to silence warning
Trivial.
2017-08-24 07:30:22 -06:00
Ilia Mirkin
96be442b77 nv50/ir: properly set sType for TXF ops to U32
All of the coordinates and LOD args are integers for TXF. This mostly
doesn't matter, except for converting into a levelZero=true operation by
removing an explicit zero LOD. For the comparison against zero to work
properly, the sType of the instruction has to be set correctly.

Fixes: KHR-GL45.robust_buffer_access_behavior.texel_fetch
Reported-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-08-24 08:41:57 -04:00
Samuel Pitoiset
bfef3fabc6 mesa: remove duplicate assignments in bind_xfb_buffers()
Useless to do that before checking errors. It's now similar to
the other bind_XXX_buffers() helpers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-24 11:01:50 +02:00
Samuel Pitoiset
f8b47b4789 mesa: fix debug/error messages in glColorMaski()
Trivial. While we are at it, adjust indentation.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-24 11:01:47 +02:00
Timothy Arceri
4009370232 glsl: stop adding pointers from bindless structs to the cache
This is so we always create reproducible cache entries. Consistency
is required for verification of any third party distributed shaders.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-24 11:18:48 +10:00
Timothy Arceri
a6618afd27 glsl: stop adding pointers from shader_info to the cache
This is so we always create reproducible cache entries. Consistency
is required for verification of any third party distributed shaders.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-24 11:18:48 +10:00
Timothy Arceri
3ea3f75723 compiler: move pointers to the start of shader_info
This will allow us to easily skip them when writing the struct
to disk cache.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-24 11:18:48 +10:00
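
A minimal sketch of why grouping the pointers at the start of a struct helps reproducibility: the serializer can skip a fixed-size prefix and write only the remaining bytes. The struct and function below are hypothetical and much smaller than shader_info; they only illustrate the idea, assuming the struct has no padding after the last pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct: all pointer members are grouped at the start. */
struct info_example {
   const char *name;
   const char *label;
   unsigned num_textures;
   unsigned num_ubos;
};

/* Copy everything after the leading pointers into out and return the
 * byte count. out must have room for sizeof(struct info_example) bytes. */
static size_t
serialize_without_pointers(const struct info_example *info,
                           unsigned char *out)
{
   const size_t skip = offsetof(struct info_example, num_textures);
   const size_t len = sizeof(*info) - skip;

   memcpy(out, (const unsigned char *)info + skip, len);
   return len;
}
```

Two structs that differ only in their pointer values then serialize to identical bytes, which is what a reproducible cache entry needs.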
Timothy Arceri
44918a1979 glsl: always write a name/label string to the cache
In the following patch we will stop writing the pointer to cache.

Unfortunately adding empty strings to that cache seems to be the
only thing we can do here once we no longer have the pointers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-24 11:18:48 +10:00
Timothy Arceri
22154823d2 glsl: don't write uniform storage offset if there isn't one
This is so we always create reproducible cache entries. Consistency
is required for verification of any third party distributed shaders.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-24 11:18:48 +10:00
Timothy Arceri
2662269ad7 glsl: add has_uniform_storage() helper to shader cache
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-24 11:18:48 +10:00
Timothy Arceri
073a84ff60 glsl: stop adding pointers from glsl_struct_field to the cache
This is so we always create reproducible cache entries. Consistency
is required for verification of any third party distributed shaders.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-24 11:18:48 +10:00
Timothy Arceri
37d453b55a glsl: stop adding pointers from gl_shader_variable to the cache
This is so we always create reproducible cache entries. Consistency
is required for verification of any third party distributed shaders.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-24 11:18:48 +10:00
Timothy Arceri
37eb67714e glsl: allow NULL to be passed to encode_type_to_blob()
This will be used by the following commit.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-24 11:18:48 +10:00
Dave Airlie
8985ad494b radv/gfx9: don't expose linear depth on vega.
This just zeros out the linear flags for gfx9 + depth formats.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-24 01:14:15 +01:00
Dave Airlie
5d26e0baf2 radv: don't degrade tiling mode for small compressed or depth texture.
This is what radeonsi does, so we should do the same, also vega
doesn't support linear depth textures anyways.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-24 01:14:15 +01:00
Dave Airlie
bae7723e13 radv/gfx9: only minify image view width/height/depth before gfx9.
For gfx9 the addressing for images has changed, so we need to
provide the hw with the level0, however we still need to scale
for format block differences (so our compressed upload paths still
work).

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-24 01:14:15 +01:00
Dave Airlie
a74d987431 radv/image: don't rescale width/height if the format isn't changing
If the image view has the same format, we don't need to rescale
the w/h.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-24 01:14:14 +01:00
Dave Airlie
5378b5d071 radv: cleanup some image view descriptor setup.
Avoid passing the vulkan image creation into the image view descriptor
setup. This cleans up the usage of range inside the init, instead
using the properly inited values in the image view.

This is just a cleanup but some future vega changes will depend on it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-24 01:14:14 +01:00
Dave Airlie
9c080100d3 radv/gfx9: emit sx_mrt_blend registers
GFX9 needs the SX MRT blend registers programmed, port over
the code from radeonsi to work out the values from the blend
state, and program the registers on rbplus systems.

This fixes lots of:
dEQP-VK.pipeline.blend.*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-24 01:14:14 +01:00
Dave Airlie
864eb18527 radv: bump space check for indexed draw.
For the GFX9 packet we need one more dword.

Fixes an assert in:
dEQP-VK.draw.shader_draw_parameters.base_vertex.draw_indexed

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-24 01:14:14 +01:00
Dave Airlie
d987b4ab9e radv/gfx9: fixup db/stencil disable.
This fixes disabled Z/stencil.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-24 01:14:14 +01:00
Dave Airlie
11834195e9 radv/gfx9: fix level count in color register setup.
There was an off by one here.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-24 01:14:14 +01:00
Dave Airlie
df09f1f3cd radv/gfx9: use total levels in texture descriptor
We need to use all the levels when filling out the gfx9
descriptor.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-24 01:14:13 +01:00
Bas Nieuwenhuizen
6bafb56df6 radv: Implement bc optimize.
Seems like we actually enabled it already, but did not implement
the shader part. With this patch we do.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-24 00:57:03 +02:00
Bas Nieuwenhuizen
a7f5545ede ac/nir: refactor input variable iteration.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-24 00:57:03 +02:00
Kenneth Graunke
4ffa9f3635 i965: Stop using wm_prog_data->binding_table.render_target_start.
Render target surfaces always start at binding table index 0.
This is required for us to use headerless FB writes, which we
really want to do.  So, we'll never change that.

Given that, it's not necessary to look up a wm_prog_data field
which we already know contains 0.  We can drop the dependency in
brw_renderbuffer_surfaces (Gen4-5)...which was already confusingly
missing from gen6_renderbuffer_surfaces.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-08-23 11:55:17 -07:00
Kenneth Graunke
274afad4cd i965: Add a brw_wm_prog_data::has_render_target_reads field.
State upload code should use prog_data rather than poking at shader_info
directly.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-08-23 11:55:17 -07:00
Kenneth Graunke
00b7d04181 i965: Inline brw_update_renderbuffer_surfaces().
Less baklava layers.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-08-23 11:55:17 -07:00
Kenneth Graunke
7af023edc0 i965: Pass fb into emit_null_surface instead of dimensions.
We either want the framebuffer dimensions or 1x1x1.  Passing fb and
falling back to 1x1x1 lets us shorten some calls.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-08-23 11:55:17 -07:00
Kenneth Graunke
e2dab867ac i965: Devirtualize update_renderbuffer_surface.
Replace piles of my own boilerplate with 1-2 lines of code.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-08-23 11:55:17 -07:00
Kenneth Graunke
081c54099c i965: Delete update_renderbuffer_surface flags.
We don't need yet another set of flags.  The function already has access
to both brw and the unit, so it can check brw->draw_aux_buffer_disabled
itself in one line of code.  The layered flag was only used to assert
that Gen4-5 doesn't do layered rendering, which isn't that useful.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-08-23 11:55:17 -07:00
Kenneth Graunke
f70e0f52c9 i965: Make brw_update_renderbuffer_surface static.
Also rename it to gen6_update_renderbuffer_surface, as this is the
function for Gen6+.  Having functions named "brw_*" and "gen4_*"
is confusing...if we're using gens, let's stick with those.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-08-23 11:55:17 -07:00
Kenneth Graunke
b96313c0e1 i965: Drop BRW_NEW_BLORP from SURFACE_STATE setup code.
BLORP invalidates the binding tables, but it doesn't destroy any of the
existing SURFACE_STATE entries in the statebuffer.  We can reuse those.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-23 11:55:17 -07:00
Kenneth Graunke
54c41af0aa i965: Make a BRW_NEW_FAST_CLEAR_COLOR dirty bit.
When changing fast clear colors, we need to emit new SURFACE_STATE
with the updated color at the next draw call.

Most things work today because the atoms that handle SURFACE_STATE
for images (mutable images, textures, render targets) also listen to
BRW_NEW_BLORP, causing us to re-emit these on every BLORP operation.
However, this is overkill - most BLORP operations don't require us
to re-emit SURFACE_STATE.

One case where this is broken today is a fast clear to a different
color followed by a non-coherent framebuffer fetch.  The renderbuffer
read atom doesn't listen to BRW_NEW_BLORP, and would not get the new
fast clear color.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-23 11:55:17 -07:00
Kenneth Graunke
d0b40e2c87 i965: Drop Gen7+ nonsense from brw_ff_gs.c.
brw_ff_gs.c is about using the geometry shader to implement things
that the fixed function ought to do, but doesn't on old hardware.

Gen7+ does not need this.  We should drop the misleading comment
about Gen7 not using geometry shaders.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-23 11:55:17 -07:00
Kenneth Graunke
eaf5b8722b i965: Only set key->flat_shade if COL0/COL1 are written.
This may reduce some recompiles.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-23 11:55:17 -07:00
Kenneth Graunke
348929015b i965: Clean up brwNewProgram().
All shader stages do the exact same thing, so we don't need the switch
statement, or the redundant FS case.  I believe these used to be
different before Tim eliminated the (e.g.) brw_vertex_program
subclasses.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-23 11:55:17 -07:00
Leo Liu
5ff97f2644 st/va: exclude the buffer reallocation for encode case
Since the encoder only supports de-interlaced buffers.

v2: move to parameter call to tell dec/enc

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-23 14:51:12 -04:00
Tim Rowley
f0602dc920 swr: limit pipe_draw_info->restart_index usage
Only copy this value when in restart drawing mode.

Eliminates valgrind errors when running trivial programs.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-23 11:37:50 -05:00
Samuel Pitoiset
7fb4b6f270 radeonsi: fix wrong assertion in si_init_bindless_descriptors()
Bad mistake, sorry.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-23 17:13:44 +02:00
Leo Liu
89f75c9483 radeon/video: Return false explicitly for HEVC if not the case
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-23 10:51:14 -04:00
Gwan-gyeong Mun
9649c6acce gallium/docs: Fix the math formula of U2I64
before:
  dst.xy = (uint64_t) src0.x
  dst.zw = (uint64_t) src0.y

after:
  dst.xy = (int64_t) src0.x
  dst.zw = (int64_t) src0.y

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-23 14:09:49 +02:00
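
The corrected U2I64 formula can be read as a zero-extending conversion: the source is treated as unsigned 32-bit and the result is a signed 64-bit value. A small C sketch of one channel, with a sign-extending counterpart for contrast (both function names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* One channel of U2I64 as documented: unsigned 32-bit in, signed 64-bit
 * out. The value is zero-extended, so it is never negative. */
static int64_t u2i64(uint32_t src)
{
   return (int64_t)src;
}

/* For contrast: a signed 32-bit source is sign-extended instead. */
static int64_t i2i64(int32_t src)
{
   return (int64_t)src;
}
```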
Gwan-gyeong Mun
9aabf80ef3 gallium/docs: Add missing word "Not"
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-23 14:09:22 +02:00
Nicolai Hähnle
26996ec3b8 tgsi: store opcode mnemonics in a separate table
They are only used for debug info.

Together with making tgsi_opcode_info::opcode a bitfield, this reduces
the size of tgsi_opcode_info on 64-bit systems from 24 bytes to 4 bytes,
and makes the whole data structure a bit more linker friendly.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-23 13:54:57 +02:00
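
A rough sketch of the size reduction described above, using hypothetical structs rather than the real tgsi_opcode_info layout. Moving the name pointer into a separate table and packing the opcode into a bitfield lets the whole record fit in one bitfield storage unit; exact sizes depend on the ABI, since bitfield layout is implementation-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Before: the record carries a pointer and a full-width opcode. */
struct opcode_info_wide {
   unsigned num_dst:3;
   unsigned num_src:3;
   unsigned is_tex:1;
   const char *mnemonic;
   unsigned opcode;
};

/* After: everything is a bitfield and the name lives elsewhere. */
struct opcode_info_compact {
   unsigned num_dst:3;
   unsigned num_src:3;
   unsigned is_tex:1;
   unsigned opcode:8;
};

/* Separate name table, only needed for debug output (sample entries). */
static const char *const opcode_names[] = { "ARL", "MOV", "LIT" };

static const char *get_opcode_name(unsigned opcode)
{
   const size_t count = sizeof(opcode_names) / sizeof(opcode_names[0]);

   return opcode < count ? opcode_names[opcode] : "UNKNOWN";
}
```

Because the compact record contains no pointers, the table also needs no load-time relocations, which is presumably the "linker friendly" part of the commit message.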
Nicolai Hähnle
438177aa19 gallium: use tgsi_get_opcode_name instead of tgsi_opcode_info::mnemonic
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-23 13:54:55 +02:00
Nicolai Hähnle
2f7c55c23f tgsi: macro-ify the opcodes table
So we can easily re-arrange members of tgsi_opcode_info, and readers of
the code don't have to guess what all the 0s mean.

Mostly done with regex search&replace.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-23 13:54:53 +02:00
Nicolai Hähnle
48ef0a1ee4 tgsi: remove post_indent from some 64-bit opcodes
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-23 13:54:51 +02:00
Nicolai Hähnle
3f433e927c tgsi: reduce tgsi_opcode_info::pre_dedent and post_indent to 1 bit
It's not clear why they were ever 2 bits to begin with. Perhaps
the original intent was to use signed values, but that doesn't
seem to have ever been the case in master.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-23 13:54:47 +02:00
Nicolai Hähnle
83c5d12d9d gallium/radeon: fix saving multi-part command streams
Use the correct type to fix pointer arithmetic.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-23 13:54:09 +02:00
Nicolai Hähnle
8937ac9a13 ac/debug: invoke valgrind checks while parsing IBs
Help catch garbage data written into IBs.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-23 13:54:07 +02:00
Nicolai Hähnle
c2c3912410 ac/debug: annotate IB dumps with the raw values
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-23 13:54:05 +02:00
Nicolai Hähnle
cfb3824c23 ac/debug: use an explicit getter for fetching words from the IB
Guard against out-of-bounds accesses, and prepare for upcoming changes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-23 13:54:03 +02:00
Nicolai Hähnle
6fdd7ba32e radeonsi: update comment describing indices into sctx->descriptors
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-23 13:54:01 +02:00
Nicolai Hähnle
556946f801 util: fix valgrind errors when dumping pipe_draw_info
Various index-related fields are only initialized when required, so
they should only be dumped in those cases.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-23 13:53:54 +02:00
Samuel Pitoiset
94cc01105e radeonsi: do not assert when reserving bindless slot 0
When assertions were disabled, the compiler removed
the call to util_idalloc_alloc() and the first allocated
bindless slot was 0 which is invalid per the spec.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-23 13:38:56 +02:00
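
The bug class fixed above is a call with side effects placed inside an assertion. A minimal sketch, assuming a hypothetical allocator and a RELEASE_ASSERT macro that stands in for assert() when NDEBUG is defined (neither name is from Mesa):

```c
#include <assert.h>

/* Hypothetical stand-in for assert() in a release (NDEBUG) build: the
 * argument expression is discarded and never evaluated. */
#define RELEASE_ASSERT(expr) ((void)0)

static unsigned next_id;

/* Hypothetical allocator: hands out 0, 1, 2, ... in order. */
static unsigned alloc_id(void)
{
   return next_id++;
}

/* Buggy: the reservation of slot 0 lives inside the assertion, so it
 * disappears together with the assertion. */
static unsigned first_usable_slot_buggy(void)
{
   next_id = 0;
   RELEASE_ASSERT(alloc_id() == 0);
   return alloc_id();
}

/* Fixed: the call always runs; only the check is optional. */
static unsigned first_usable_slot_fixed(void)
{
   next_id = 0;
   unsigned reserved = alloc_id();
   RELEASE_ASSERT(reserved == 0);
   (void)reserved;
   return alloc_id();
}
```

In the buggy variant the first slot handed to a caller is 0, matching the symptom in the commit message; in the fixed variant it is 1.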
Samuel Pitoiset
f4ec41ecc4 radeonsi: rename some bindless-related helper functions
I think it makes more sense.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-23 13:37:07 +02:00
Samuel Pitoiset
9141d13214 radeonsi: minor cleanups in si_make_{texture,image}_handle_resident()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-23 13:37:05 +02:00
Rob Herring
f8e4223728 Android: gallium_dri: pass dri.sym to linker
Pass the dri.sym version script to the linker. This ensures only
explicitly exported symbols are exported and shrinks the library by up
to 60KB.

HAVE_DLADDR also needs to be set so that __driDriverExtensions is defined.

We need to pass "--undefined-version" because the Android build system
sets --no-undefined-version by default and we get an error on
driver specific symbols if those drivers are disabled without the option.

Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-08-22 19:02:12 -05:00
Leo Liu
2b025a11be st/va: enable P016 format i.e. reallocate buffer if format changed
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-22 15:13:42 -04:00
Leo Liu
398a299f7b radeon/vcn: enable P016 mode support
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-22 15:13:34 -04:00
Leo Liu
df6c087a38 radeon/vcn: correct target buffer pitch calculation
The pitch should be calculated the same way as for UVD.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-22 15:12:19 -04:00
Francisco Jerez
e29ccaac29 anv: Check that in_fence fd is valid before closing it.
Probably harmless, but will overwrite errno with a failure status
code.  Reported by coverity.

CID 1416600: Argument cannot be negative (NEGATIVE_RETURNS)
Fixes: 5c4e4932e0 (anv: Implement support for exporting semaphores as FENCE_FD)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-22 11:56:38 -07:00
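
The pattern behind the in_fence fix can be sketched with a small POSIX helper. Calling close() on a negative descriptor fails with EBADF and overwrites errno, so the guard keeps any earlier error code intact. The helper name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Close fd only if it refers to a possibly valid descriptor. A negative
 * value means "no fd", and close() is skipped so errno stays untouched. */
static void close_if_valid(int fd)
{
   if (fd >= 0)
      close(fd);
}
```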
Francisco Jerez
7ca124a6a3 anv: Add error handling to setup_empty_execbuf().
The anv_execbuf_add_bo() call can actually fail in practice, which
should cause the QueueSubmit operation to fail.  Reported by Coverity.

CID: 1416606: Unchecked return value (CHECKED_RETURN)
Fixes: 017cdb10cf (anv: Submit a dummy batch when only semaphores are provided.)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-22 11:54:16 -07:00
Marek Olšák
4d807d7fe2 tgsi/scan: fix uses_double
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 18:11:28 +02:00
Marek Olšák
497506ad93 gallium: remove TGSI opcode SCS
use COS+SIN instead.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2017-08-22 16:42:17 +02:00
Marek Olšák
33efa6416f gallium/u_blitter: don't use boolean, TRUE, FALSE
v2: cherry-picked from the bigger patch series

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Tested-by: Brian Paul <brianp@vmware.com>
2017-08-22 15:21:19 +02:00
Marek Olšák
c7ad07758e gallium/u_simple_shaders: do util_make_layered_clear_vertex_shader differently
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Brian Paul <brianp@vmware.com>
2017-08-22 15:16:44 +02:00
Marek Olšák
8f75a6f1af gallium/u_blitter: remove get_next_surface_layer callback
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Brian Paul <brianp@vmware.com>
2017-08-22 15:16:44 +02:00
Samuel Pitoiset
e2f3cfead9 st/glsl_to_tgsi: fix getting the image type for array of structs (again)
We want the type of the field, not of the struct.

This fixes a regression in the following piglit test:
arb_bindless_texture/compiler/images/arrays-of-struct.frag

Fixes: 49d9286a3f ("glsl: stop copying struct and interface member names")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-22 13:58:51 +02:00
Marek Olšák
cdaaf66566 gallium: remove TGSI opcode BREAKC
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:33:48 +02:00
Marek Olšák
985e6b5ef9 gallium: remove TGSI opcode XPD
use MUL+MAD+MOV instead.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
3e2ff8fade gallium: remove TGSI opcode DPH
use DP4 or DP3 + ADD.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-08-22 13:29:47 +02:00
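
As an illustration of the suggested replacement, assuming the usual definition of DPH (a three-component dot product plus the w component of the second operand), the two forms below give the same result. The vec4 type and function names are hypothetical.

```c
#include <assert.h>

struct vec4 { float x, y, z, w; };

static float dp3(struct vec4 a, struct vec4 b)
{
   return a.x * b.x + a.y * b.y + a.z * b.z;
}

static float dp4(struct vec4 a, struct vec4 b)
{
   return dp3(a, b) + a.w * b.w;
}

/* DPH expressed as DP3 + ADD. */
static float dph_via_dp3_add(struct vec4 a, struct vec4 b)
{
   return dp3(a, b) + b.w;
}

/* DPH expressed as DP4 with the first operand's w forced to 1. */
static float dph_via_dp4(struct vec4 a, struct vec4 b)
{
   a.w = 1.0f;
   return dp4(a, b);
}
```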
Marek Olšák
86e6f7a73b gallium: remove TGSI opcode DP2A
use DP3 instead.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
0bb367830a gallium: remove TGSI_OPCODE_CALLNZ
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
068c3ad2cb gallium: remove TGSI FENCE opcodes
use MEMBAR instead

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
44716655e6 gallium: remove TGSI opcodes PUSHA, POPA, SAD, TXQ_LZ
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
8dadb07790 radeonsi: emit VGT_REUSE_OFF in the right place
clip_regs aren't marked dirty when writes_viewport_index is changed.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
a6fed63f27 radeonsi: add support for TGSI opcodes DCEIL, DFLR, DROUND, DSSG, DTRUNC
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
addd48194a radeonsi: use a faster version of PK2H
+ 4 piglit regressions, but it's correct according to the GL spec and
performance is more important than piglit.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
dc2ac03669 radeonsi: don't decompress Z/S if there is no HTILE
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
e96259fabe gallium/radeon: add helpers for whether HTILE is enabled
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
7dec48b81e radeonsi/gfx9: don't flush L2 metadata for DB if not needed
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
aa64e24cb1 radeonsi/gfx9: don't flush L2 metadata for CB if not needed
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
5b62eb237c radeonsi/gfx9: don't flush TC L2 between rendering and texturing if not needed
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
287b0a28f4 radeonsi/gfx9: use correct TC flush flags when invalidating CB & DB
Now we can finally stop flushing L2 data.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
759526813b ac/surface/gfx9: don't allow DCC for the smallest mipmap levels
This fixes garbage there if we don't flush TC L2 after rendering.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
54c2c771bd radeonsi/gfx9: don't use GS scenario A for VS writing ViewportIndex
Vulkan doesn't do it anymore.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
776fcccabf gallium/radeon: clean up EOP_DATA_SEL magic numbers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
a57f588fa9 radeonsi/gfx9: set 'not a query' for r600_gfx_write_event_eop correctly
0 is PIPE_QUERY_OCCLUSION_COUNTER, which is not what we want.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
a65afda768 radeonsi/gfx9: prevent shader-db crashes
- don't precompile LS and ES (they don't exist on GFX9), compile as VS instead
- don't precompile HS and GS (we don't have LS and ES parts)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
fdef2f0fd1 radeonsi/gfx9: properly handle imported textures with unexpected swizzle mode
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
113278ee79 radeonsi: remove Constant Engine support
We have come to the conclusion that it doesn't improve performance.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
166823bfd2 radeonsi/gfx9: add a temporary workaround for a tessellation driver bug
The workaround will do for now. The root cause is still unknown.

This fixes new piglit: 16in-1out

Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Marek Olšák
248555ed2f glsl_to_tgsi: clean up opcode translation
An island of beauty in the middle of chaos.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-22 13:29:47 +02:00
Timothy Arceri
da28280544 mesa: pass ctx to add_uniform_to_shader constructor
Fixes: 4c2422067b ("glsl: pass UseSTD430AsDefaultPacking to where it will be used")

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-22 21:25:55 +10:00
Gwan-gyeong Mun
640b6e607c egl: deduplicate allocations of local buffer over each platform backend (v2)
platform_drm, platform_wayland and platform_android have similar local buffer
allocation routines. For deduplicating, it unifies dri2_egl_surface's
local buffer allocation routines. And it polishes inconsistent indentations.

Note that since dri2_wl_get_buffers_with_format() does not create a
__DRI_BUFFER_BACK_LEFT attachment buffer for local_buffers, the new helper
function, dri2_egl_surface_free_local_buffers(), drops the
__DRI_BUFFER_BACK_LEFT check.
So if other platforms use the new helper functions, we have to ensure they do
not create a __DRI_BUFFER_BACK_LEFT attachment buffer for local_buffers.

v2: Fixes from Emil's review:
   a) Make local_buffers variable, dri2_egl_surface_alloc_local_buffer() and
      dri2_egl_surface_free_local_buffers() unconditionally.
   b) Preserve the original codeflow for error_path and normal_path.
   c) Add note on commit messages for dropping of __DRI_BUFFER_BACK_LEFT check.
   d) Rollback the unrelated whitespace changes.
   e) Add a missing blank line.

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2017-08-22 12:10:15 +01:00
Samuel Pitoiset
46a8c4ef81 mesa: only expose EXT_memory_object functions if the ext is supported
They should not be exposed when the extension is unsupported.
Note that ARB_direct_state_access is always exposed and
EXT_semaphore is not supported at all.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 11:54:32 +02:00
Samuel Pitoiset
44cd9aeeec mesa: only expose glImportMemoryFdEXT if the ext is supported
From the EXT_external_objects_fd spec:

   "If the GL_EXT_memory_object_fd string is reported, the following
    commands are added:

    void ImportMemoryFdEXT(uint memory,
                           uint64 size,
                           enum handleType,
                           int fd);"

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 11:54:29 +02:00
Samuel Pitoiset
39a35eb0c1 radeonsi: try to re-use previously deleted bindless descriptor slots
Currently, when the array is full it is resized but it can grow
over and over because we don't try to re-use descriptor slots.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 11:34:37 +02:00
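
Slot reuse as described above can be sketched with a tiny fixed-size allocator that always returns the lowest free index. This is a hypothetical stand-in, not the gallium/util module added in this series, and it omits the resize path.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SLOTS 64

/* Hypothetical allocator state: one "in use" flag per slot. */
struct slot_allocator {
   bool used[MAX_SLOTS];
};

/* Return the lowest free slot and mark it used, or -1 if all are taken. */
static int slot_alloc(struct slot_allocator *a)
{
   for (int i = 0; i < MAX_SLOTS; i++) {
      if (!a->used[i]) {
         a->used[i] = true;
         return i;
      }
   }
   return -1;
}

/* Release a slot so that a later slot_alloc() can hand it out again. */
static void slot_free(struct slot_allocator *a, int slot)
{
   assert(slot >= 0 && slot < MAX_SLOTS);
   a->used[slot] = false;
}
```

Because freed indices are handed out again before new ones, the backing array only has to grow when every slot is genuinely in use.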
Samuel Pitoiset
c2dfa9b111 radeonsi: use slot indexes for bindless handles
Using VRAM address as bindless handles is not a good idea because
we have to use LLVMIntToPTr and the LLVM CSE pass can't optimize
because it has no information about the pointer.

Instead, use slot indexes like the existing descriptors. Note
that we use fixed 16-dword slots for both samplers and images.
This doesn't really matter because no real apps use image handles.
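
As a small illustration of the fixed 16-dword slots mentioned above, a slot
index can be turned into a byte offset into the descriptor array as shown
below. This is only a sketch: slot_to_byte_offset and the constant names are
hypothetical, and the 4-byte dword size is the usual assumption.

```c
#include <stdint.h>

enum {
    DWORDS_PER_SLOT = 16, /* fixed slot size stated in the commit message */
    BYTES_PER_DWORD = 4   /* a dword is assumed to be 32 bits */
};

/* Byte offset of a slot within the descriptor array (64 bytes per slot). */
static uint64_t slot_to_byte_offset(uint32_t slot)
{
    return (uint64_t)slot * DWORDS_PER_SLOT * BYTES_PER_DWORD;
}
```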

This improves performance with DOW3 by +7%.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 11:34:29 +02:00
Samuel Pitoiset
50349f404d radeonsi: add si_emit_global_shader_pointers() helper
To share common code between rw buffers and bindless descriptors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 11:34:24 +02:00
Samuel Pitoiset
a5ff4a8e2e radeonsi: only initialize dirty_mask when CE is used
Looks like it's useless to initialize that field when CE is
unused. This will also allow declaring more than 64 elements
for the array of bindless descriptors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 11:34:23 +02:00
Samuel Pitoiset
a29ef75565 radeonsi: make some si_descriptors fields 32-bit
The number of bindless descriptors is dynamic and we definitely
have to support more than 256 slots.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 11:34:21 +02:00
Samuel Pitoiset
781a13c475 radeonsi: declare new user SGPR indices for bindless samplers/images
A new pair of user SGPR is needed for loading the bindless
descriptors from shaders. Because the descriptors are global for
all stages, there is no need to add separate indices for GFX9.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 11:34:15 +02:00
Samuel Pitoiset
e2793def40 gallium/util: add new module that allocates "numbers"
Will be used for allocating bindless descriptor slots for
RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 11:34:04 +02:00
Nicolai Hähnle
472c906d9f radeonsi/gfx9: add performance counters
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:55:16 +02:00
Nicolai Hähnle
e271607668 radeonsi: extract common code of si_upload_{graphics,compute}_shader_descriptors
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:55:05 +02:00
Nicolai Hähnle
a6e7693882 gallium: remove unused PIPE_DUMP_* defines
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:53:35 +02:00
Nicolai Hähnle
635a930ad3 ddebug: remove dd_draw_record::driver_state_log
It is no longer used.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:53:35 +02:00
Nicolai Hähnle
f4c1d5a76d radeonsi: emit string markers to log context
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:53:34 +02:00
Nicolai Hähnle
0c3f8aca7f radeonsi: log decompress blits
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:53:34 +02:00
Nicolai Hähnle
420c438589 radeonsi: log draw and compute state into log context
Also add missing trace emits and CS logging for compute launches.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:53:34 +02:00
Nicolai Hähnle
4c3f36ec6b radeonsi: print saved CS to the log context
Use the auto logger facility, so that CS chunks will be interleaved
with other log info.

v2:
- fix some crashes when not using CE
- fix skipping "previous" chunks of current (unflushed) IB
- fix error handling in si_begin_cs_debug

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:53:14 +02:00
Nicolai Hähnle
bc93339799 radeonsi: start using u_log_context for debugging
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:51:00 +02:00
Nicolai Hähnle
ad33f2ddd8 radeonsi: re-order debug state dumping
Keep together the parts that won't use the deferred logging mechanism.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:50:57 +02:00
Nicolai Hähnle
40697e8678 radeonsi: make si_shader_selector_reference globally visible
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:50:55 +02:00
Nicolai Hähnle
4bbf6ded20 radeonsi: add reference count to si_compute
To allow keep-alive for deferred logging.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:50:53 +02:00
Nicolai Hähnle
bbaad18c04 radeonsi: implement pipe_context::set_log_context
We'll add radeonsi-specific code to set_log_context in later patches,
but we may want to log from common code. Hence keep the log pointer
in r600_common_context.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:50:48 +02:00
Nicolai Hähnle
fbbb5f71cd amd/common: split out ac_parse_ib_chunk from ac_parse_ib
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:50:46 +02:00
Nicolai Hähnle
81d7577d48 ddebug: add driver log to record dumps
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:50:44 +02:00
Nicolai Hähnle
1966d9ff41 gallium: add pipe_context::set_log_context
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:50:42 +02:00
Nicolai Hähnle
177144cefc util/log: add auto logger facility
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:50:40 +02:00
Nicolai Hähnle
1cc2fd57d1 util: add chunk logging module
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 09:50:31 +02:00
Ian Romanick
b3a481779b glsl/linker: Make several functions not static
copy_constant_to_storage, set_uniform_initializer,
populate_consumer_input_sets, and get_matching_input are all used by
tests in src/compiler/glsl/tests:

glsl/tests/varyings_test.o: In function `link_varyings_single_simple_input_Test::TestBody()':
src/compiler/glsl/tests/varyings_test.cpp:131: undefined reference to `linker::populate_consumer_input_sets(void*, exec_list*, hash_table*, hash_table*, ir_variable**)'
glsl/tests/varyings_test.o: In function `link_varyings_gl_ClipDistance_Test::TestBody()':
src/compiler/glsl/tests/varyings_test.cpp:159: undefined reference to `linker::populate_consumer_input_sets(void*, exec_list*, hash_table*, hash_table*, ir_variable**)'
glsl/tests/varyings_test.o: In function `link_varyings_gl_CullDistance_Test::TestBody()':
src/compiler/glsl/tests/varyings_test.cpp:186: undefined reference to `linker::populate_consumer_input_sets(void*, exec_list*, hash_table*, hash_table*, ir_variable**)'
glsl/tests/varyings_test.o: In function `link_varyings_single_interface_input_Test::TestBody()':
src/compiler/glsl/tests/varyings_test.cpp:208: undefined reference to `linker::populate_consumer_input_sets(void*, exec_list*, hash_table*, hash_table*, ir_variable**)'
glsl/tests/varyings_test.o: In function `link_varyings_one_interface_and_one_simple_input_Test::TestBody()':
src/compiler/glsl/tests/varyings_test.cpp:241: undefined reference to `linker::populate_consumer_input_sets(void*, exec_list*, hash_table*, hash_table*, ir_variable**)'
glsl/tests/varyings_test.o:src/compiler/glsl/tests/varyings_test.cpp:272: more undefined references to `linker::populate_consumer_input_sets(void*, exec_list*, hash_table*, hash_table*, ir_variable**)' follow
glsl/tests/varyings_test.o: In function `link_varyings_interface_field_doesnt_match_noninterface_Test::TestBody()':
src/compiler/glsl/tests/varyings_test.cpp:289: undefined reference to `linker::get_matching_input(void*, ir_variable const*, hash_table*, hash_table*, ir_variable**)'
glsl/tests/varyings_test.o: In function `link_varyings_interface_field_doesnt_match_noninterface_vice_versa_Test::TestBody()':
src/compiler/glsl/tests/varyings_test.cpp:314: undefined reference to `linker::populate_consumer_input_sets(void*, exec_list*, hash_table*, hash_table*, ir_variable**)'
src/compiler/glsl/tests/varyings_test.cpp:328: undefined reference to `linker::get_matching_input(void*, ir_variable const*, hash_table*, hash_table*, ir_variable**)'

Fixes: ca73c3358c ("glsl: Mark functions static")

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-08-22 17:43:40 +10:00
Jason Ekstrand
0ae9ce0f29 i965/clear: Quantize the depth clear value based on the format
In f9fd976e8a we changed the clear value to be stored as an
isl_color_value.  This had the side effect that the same-clear-value check
now happens directly between the f32[0] field of the isl_color_value and
ctx->Depth.Clear.  This isn't what we want for two reasons.  One is that
the comparison happens in floating point even for Z16 and Z24 formats.
Worse than that, ctx->Depth.Clear is a double so, even for 32-bit float
formats, we were comparing as doubles and not floats.  This means that
the test basically always fails for anything other than 0.0f and 1.0f.
This caused a slight performance regression in Lightsmark 2008 because
it was using a depth clear value of 0.999, which can't be represented
exactly in a 32-bit float, so we were doing unneeded resolves.
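
The comparison problem described above can be reproduced with a few lines of
C. This is a sketch under stated assumptions, not the i965 code: all function
names are hypothetical, and the 16-bit case assumes a simple round-to-nearest
unorm conversion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Naive check: the stored float is promoted to double, so the values only
 * match when the requested double is exactly representable as a float. */
static bool same_clear_naive(float stored, double requested)
{
    return (double)stored == requested;
}

/* Check for a 32-bit float depth format: round the request to float first. */
static bool same_clear_f32(float stored, double requested)
{
    return stored == (float)requested;
}

/* Convert a depth value to the integer a 16-bit unorm buffer would hold
 * (assumed conversion: clamp to [0, 1], scale, round to nearest). */
static uint16_t depth_to_unorm16(double depth)
{
    if (depth < 0.0)
        depth = 0.0;
    if (depth > 1.0)
        depth = 1.0;
    return (uint16_t)(depth * 65535.0 + 0.5);
}

/* Check for a 16-bit depth format: compare the quantized values. */
static bool same_clear_z16(float stored, double requested)
{
    return depth_to_unorm16(stored) == depth_to_unorm16(requested);
}
```

With a requested value of 0.999 and a stored value of (float)0.999, the naive
check reports a mismatch while both quantized checks report a match. Values
such as 0.0 and 1.0 are exactly representable, so the naive check happens to
work for them.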

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/101678
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
2017-08-21 22:18:53 -07:00
Timothy Arceri
3c9ed70d92 mesa/st: simplify some UBO index logic
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 13:32:21 +10:00
Timothy Arceri
36431cf979 i965: enable STD430 packing by default on IVB+
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-22 11:29:27 +10:00
Timothy Arceri
4c2422067b glsl: pass UseSTD430AsDefaultPacking to where it will be used
Here we also make use of the UseSTD430AsDefaultPacking constant
and call the new get_internal_ifc_packing() helper.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 11:29:27 +10:00
Timothy Arceri
12e1f0c696 glsl: add get_internal_ifc_packing() type helper
This is used to avoid code duplication when selecting the
packing type for shared and packed layouts.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 11:29:27 +10:00
Timothy Arceri
334a27afa7 mesa: add UseSTD430AsDefaultPacking constant
This will be used to enable the STD430 layout as the default for
UBOs and SSBOs with layouts of shared/packed rather than STD140.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 11:29:27 +10:00
Aaron Watry
5e253fe338 clover/device: Calculate CL_DEVICE_MEM_BASE_ADDR_ALIGN in device
The CL CTS queries CL_DEVICE_MEM_BASE_ADDR_ALIGN for a device and
then allocates user pointers aligned to that value for its tests.

The minimum value is defined as:
  the size (in bits) of the largest OpenCL built-in data type supported
  by the device (long16 in FULL profile, long16 or int16 in EMBEDDED
  profile) for devices that are not of type CL_DEVICE_TYPE_CUSTOM.

At the moment, all known devices that support user pointers require
CPU page alignment for buffers created from user pointers, so just
query that from sysconf.
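
A rough C sketch of the calculation described above follows. It is not the
clover implementation (which is C++ and uses std::max); the function names
are hypothetical, and the sketch assumes that long16 occupies 16 * 8 bytes
and that the value is ultimately reported in bits, as the quoted definition
says.

```c
#include <stddef.h>
#include <unistd.h>

/* Size in bytes of the largest built-in type, long16 (16 x 64-bit). */
#define LONG16_BYTES ((size_t)(16 * 8))

/* Alignment in bytes: the larger of the CPU page size and sizeof(long16). */
static size_t mem_base_addr_align_bytes(void)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t page_bytes = page > 0 ? (size_t)page : 0;

    return page_bytes > LONG16_BYTES ? page_bytes : LONG16_BYTES;
}

/* The same alignment expressed in bits. */
static size_t mem_base_addr_align_bits(void)
{
    return 8 * mem_base_addr_align_bytes();
}
```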

v3: Use std::max instead of MAX2 (Francisco)
    Add missing unistd include
v2: Use system page size instead of a new pipe cap

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by (v2): Jan Vesely <jan.vesely@rutgers.edu>
2017-08-21 20:21:52 -05:00
Brian Paul
19e9bd4c11 mesa: optimize _mesa_attr_zero_aliases_vertex()
After the context is initialized, the API and context flags won't
change.  So, we can compute whether vertex attribute 0 aliases
vertex position just once.

This should make the glVertexAttrib*() functions a little quicker.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-21 19:04:51 -06:00
Brian Paul
0ef5aa4128 vbo: use new _is_vertex_position() helper in vbo_attrib_tmp.h
Makes the code a bit more understandable.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-21 19:04:51 -06:00
Brian Paul
1850256172 vbo: make vbo_bind_arrays() static
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-21 19:04:51 -06:00
Brian Paul
4d2b21a326 svga: replace gotos with conditionals in array drawing code
No Piglit regressions.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-08-21 19:04:51 -06:00
Brian Paul
d50b8b91d7 llvmpipe: add some whitespace between functions in lp_texture.c
Trivial.
2017-08-21 19:04:51 -06:00
Brian Paul
84509779a9 mesa: formatting clean-up in syncobj.c
Line wrap to 78 columns, etc.  Trivial.
2017-08-21 19:04:51 -06:00
Brian Paul
196a0b28a0 svga: whitespace clean-up in svga_draw_private.h
Trivial.
2017-08-21 19:04:51 -06:00
Timothy Arceri
f37824ff4a docs: remove link to MissingFunctionality wiki page
Outdated, features.txt is used instead.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-22 11:03:08 +10:00
Timothy Arceri
3025f89dd8 docs: remove MSVC testing/building from help wanted
We are using appveyor for Windows continuous integration.

https://ci.appveyor.com/project/mesa3d/mesa

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-22 11:03:08 +10:00
Timothy Arceri
33d1deb9c8 docs: remove automatic testing from help wanted
Intel has a Jenkins setup and has made the various scripts and
documentation open source.

https://github.com/janesma/mesa_jenkins

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-22 11:03:08 +10:00
Timothy Arceri
3fdd676898 docs: rename TODOs to Legacy Driver TODOs
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-22 11:03:08 +10:00
Timothy Arceri
177087de18 docs: remove link to i915g TODOs
This is an unofficial, unmaintained driver; we don't really want
people wasting effort trying to improve it.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-22 11:03:08 +10:00
Timothy Arceri
4b82ec8c68 docs: remove link to radeonsi TODO wiki page
This page is deprecated.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-22 11:03:08 +10:00
Timothy Arceri
6fceace7bf gallium/docs: remove old llvmpipe TODO
Features are already covered by features.txt like all the other
drivers.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-22 11:03:08 +10:00
Timothy Arceri
a4635c84dc mesa: fix ES only draw if we have vertex positions
This code was separated from the validation code so it could
be used with KHR_no_error paths. The return values were inverted
to reflect the name of the helper, but here the condition was
mistakenly inverted rather than the return value.

Fixes: 4df2931a87 (mesa/vbo: move some Draw checks out of validation)

Reported-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-22 10:42:18 +10:00
Matt Turner
91b8d874da glsl: Add prototype for udivmod64()
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-21 14:45:44 -07:00
Matt Turner
ca73c3358c glsl: Mark functions static
Cuts 3224 bytes of .text

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-21 14:45:44 -07:00
Matt Turner
d37d9f84ac i965: Mark functions static
Cuts 300 bytes of .text

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-21 14:45:44 -07:00
Matt Turner
f30902629c i965/vec4: Use 'class' src_reg, rather than 'struct' src_reg
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-21 14:45:44 -07:00
Matt Turner
a77d5b28ac i965/vec4: Return float from spill_cost_for_type()
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-08-21 14:45:44 -07:00
Matt Turner
76f36607b0 anv: Move clamp_int64() inside the IVB check
It's only used in the gen7_cmd_buffer_emit_scissor() function.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-21 14:45:44 -07:00
Matt Turner
ee2f7aa03b glsl: Remove unused private fields
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-21 14:45:44 -07:00
Matt Turner
384e27174d mesa: Don't compare unsigned for < 0
The INTEL_performance_query spec says

   "Performance counter id 0 is reserved as an invalid counter."

GLuint counterid_to_index(GLuint counterid) just returns counterid - 1,
so with unsigned overflow rules, it will generate 0xFFFFFFFF given an
input of 0. 0xFFFFFFFF will trigger the counterIndex >= queryNumCounters
check, so the code worked as is. It just contained a useless comparison.
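
The wraparound argument above can be checked with a short C sketch. The
counterid_to_index name comes from the commit message; uint32_t stands in
for GLuint, and counter_id_is_valid is a hypothetical wrapper added for the
example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Counter ids are 1-based; id 0 is reserved as invalid. */
static uint32_t counterid_to_index(uint32_t counterid)
{
    /* Unsigned arithmetic wraps: an input of 0 yields 0xFFFFFFFF. */
    return counterid - 1;
}

/* A single upper-bound check is enough; "index < 0" can never be true
 * for an unsigned value, so that comparison would be dead code. */
static bool counter_id_is_valid(uint32_t counterid, uint32_t num_counters)
{
    return counterid_to_index(counterid) < num_counters;
}
```

For a query with 8 counters, ids 1 through 8 are accepted, while both 0
(which wraps to 0xFFFFFFFF) and 9 are rejected by the same comparison.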

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-21 14:45:44 -07:00
Matt Turner
4e97084591 egl: Fix inclusion of egl.h+mesa_glinterop.h
Previously clang would warn about redefinition of typedef EGLDisplay. Avoid
this by adding preprocessor guards to mesa_glinterop.h and including it
after EGL.h is indirectly included.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-21 14:45:44 -07:00
Marek Olšák
db039d67aa radeonsi: don't prefetch VBO descriptors if vertex elements == NULL
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-21 23:06:42 +02:00
Marek Olšák
ea1b97714d r600g: don't set up and don't call the fetch shader if there are no VS inputs
2017-08-21 23:06:42 +02:00
Matt Turner
a98b1a8922 i965: Optimize reading the destination type
brw_hw_type_to_reg_type() needs to know only whether the file is
BRW_IMMEDIATE_VALUE or not, which is not a valid file for the
destination. gcc and clang will evaluate __builtin_strcmp() at compile
time, so we can use it to pass a constant file for the destination.

   text	   data	    bss	    dec	    hex	filename
7816214	 346248	 420496	8582958	 82f72e	i965_dri.so before
7816070	 346248	 420496	8582814	 82f69e	i965_dri.so after

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
91ef949054 i965: Mark brw_hw_type_to_reg_type() as a pure function
text	   data	    bss	    dec	    hex	filename
7816886	 346248	 420496	8583630	 82f9ce	i965_dri.so before
7816214	 346248	 420496	8582958	 82f72e	i965_dri.so after

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
e07fe89035 i965: Hide the register type hardware encodings
So we stop mixing them with the logical enum.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
4fab67a441 i965: Stop using hardware register types directly
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
c746f1c888 i965: Add brw_hw_reg_type_to_letters() and use it in brw_disasm.c
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
6a2471b501 i965: Move brw_reg_type_letters() as well
And add "to_" to the name for consistency with the other functions in
this file.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
1cb0a7941b i965: Switch to using the logical register types
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
cb2cd462b1 i965: Add functions to abstract access to register types
Previously the brw_inst{,_set}_{dst,src0,src1}_reg_type() functions
provided access to the hardware encodings for the register types. We
often mixed these with the logical BRW_REGISTER_TYPE_* enums (which
themselves used to be the hardware format!) with bad results.

With that functionality now available with the hw_ versions (see
previous commit), we now add functions that take the logical
BRW_REGISTER_TYPE_* enums and convert into the hardware format and vice
versa. To do the conversion we also have to provide the file.

Note the asymmetry between the two functions: the new getter reads the
file from the instruction word, and to ensure that is always set the
setter writes both the file and the type.
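
To illustrate why the conversion needs the file, here is a minimal C sketch
of a table-driven translation. It is not the i965 code: the enum names,
function names and numeric encodings are made up for the example. The only
property taken from this log is that a register type (B) and an immediate
vector type (VF) can share one hardware encoding.

```c
#include <assert.h>

enum reg_file { FILE_REGISTER, FILE_IMMEDIATE };
enum reg_type { TYPE_F, TYPE_B, TYPE_VF, TYPE_COUNT, TYPE_INVALID = TYPE_COUNT };

#define HW_INVALID 0xFFu

/* Made-up hardware encodings, indexed by [file][logical type]. */
static const unsigned char hw_encoding[2][TYPE_COUNT] = {
    [FILE_REGISTER]  = { [TYPE_F] = 7, [TYPE_B] = 5,          [TYPE_VF] = HW_INVALID },
    [FILE_IMMEDIATE] = { [TYPE_F] = 7, [TYPE_B] = HW_INVALID, [TYPE_VF] = 5 },
};

/* Logical type -> hardware encoding for the given file. */
static unsigned reg_type_to_hw(enum reg_file file, enum reg_type type)
{
    assert(type < TYPE_COUNT);
    return hw_encoding[file][type];
}

/* Hardware encoding -> logical type; the answer depends on the file. */
static enum reg_type hw_to_reg_type(enum reg_file file, unsigned hw)
{
    if (hw == HW_INVALID)
        return TYPE_INVALID;
    for (int t = 0; t < TYPE_COUNT; t++) {
        if (hw_encoding[file][t] == hw)
            return (enum reg_type)t;
    }
    return TYPE_INVALID;
}
```

In this sketch the encoding 5 decodes to TYPE_B for a register operand and
to TYPE_VF for an immediate, which is why a getter that only sees the
hardware bits must also know the file.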

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
9fb8323328 i965: Rename brw_inst's functions that access the register type
Put hw_ in the name so that it's clear these are the hardware encodings.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
3e379af492 i965: Index brw_hw_reg_type_to_size()'s table by logical type
I'll be transitioning everything to use the logical types.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
c1ac1a3d25 i965: Add a brw_hw_type_to_reg_type() function
Will be used in later commits.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
dbe7dd13dd i965: Use a common table to translate logical to hardware types
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
bfcc9aa829 i965: Extract functions dealing with register types to separate file
I'm going to encapsulate all of the logic dealing with register types in
this file.

Rename the parameters for the hardware encodings from type -> hw_type at
the same time.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
890f863da0 i965: Reverse file/type arguments to register type functions
I think of the initial arguments as "state" and the last as the actual
subject.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
92f787ff86 i965: Add support for disassembling 64-bit integer immediates
After the last patch converted things into enums, I helpfully got a
compiler warning about these missing from the switch statement.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
deae25ce37 i965: Use separate enums for register vs immediate types
The hardware encodings often mean different things depending on whether
the source is an immediate.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
8815b9677f i965: Reorder brw_reg_type enum values
These vaguely corresponded to the hardware encodings, but that is purely
historical at this point. Reorder them so we stop making things "almost
work" when mixing enums.

The ordering has been chosen so that no enum value is the same as a
compatible hardware encoding.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
ce6b8627d8 i965: Validate destination restrictions with vector immediates
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
1d79c828d8 i965: Don't let raw-move check be tricked by immediate vector types
UB and B type encodings are the same as UV and VF. Noticed when writing
the following patch.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
48aa6ecb87 i965: Only change type of 0.0f to VF if destination stride == 1
The destination stride must be equivalent to a dword if VF is used.

Also, since the only compaction table entries with "i:vf" have the
destination as "r:f", specifically check that the destination is of type
float.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
56a676eed2 i965: Remove CONT/BREAK from instruction compaction test
These cannot be compacted. A similar mistake was fixed in commit
90eaf01616

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
3d661e6062 i965: Test instruction compaction on all supported Gens
Note that there's no point in testing on G45, since its compaction is
the same as Gen5. Same logic applies to Gen7 variants and low-power
parts.

Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
9ff7d9b853 i965: Silence signed/unsigned comparison warning
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
eac89911e5 i965: Move compaction "prepass" into brw_eu_compact.c
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Matt Turner
17641f6388 i965: Mark src inst pointer const in compaction code
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
2017-08-21 14:05:23 -07:00
Dave Airlie
b3f87b87f6 vulkan: import 1.0.59 headers and xml.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-22 07:00:50 +10:00
Rob Herring
4734bfc02a Android: Fix LLVM duplicated symbols linking for N and M
Statically linking libLLVMCore while also dynamically linking libLLVM causes
duplicated symbols in gallium_dri.so, which then fails to dlopen. We don't
really need to link libLLVMCore, but just need generated headers to be
built first. Dynamically linking to libLLVM instead is enough to do
that. Thanks to Qiang Yu for finding the root cause.

With this change, we can align all versions and just have libLLVM as a
shared lib dependency.

This also requires changes in the M and N versions of LLVM to export the
include paths for libLLVM. AOSP master is okay.

Fixes: 26aee6f4d5 ("Android: rework LLVM build support")
Reported-by: Mauro Rossi <issor.oruam@gmail.com>
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Qiang Yu <Qiang.Yu@amd.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-08-21 10:46:21 -05:00
Andres Gomez
ca7e31fd07 docs: update calendar, add news item and link release notes for 17.1.7
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-21 18:34:42 +03:00
Andres Gomez
79bcc1eb40 docs: add sha256 checksums for 17.1.7
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-21 18:24:40 +03:00
Andres Gomez
862f35905c docs: add release notes for 17.1.7
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-08-21 18:24:38 +03:00
Leo Liu
03b89547b7 st/va: add MJPEG for config
To enable MJPEG HW decode

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
5608f44271 st/va: reallocate surface with YUYV stream
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
2ebc530ca3 st/va: detect MJPEG format from bitstream
Detect whether the stream uses the supported YUYV format from the
sampling factors embedded in the bitstream, so that this info can be
used to reallocate the buffer in the correct format.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
7319ff8787 radeon/uvd: add YUYV format support for target buffer
Make chroma plane optional for YUYV support

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
c4061bb5fa st/va: reallocate surface when interlaced
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
fceb52a230 radeon/video: MJPEG does not support stacked video buffers
So we have to detect it for reallocation of de-interlaced buffers

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
e50ee6d4d5 st/va: make surface allocate functions more useful
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
130d1f456b radeon/uvd: reconstruct MJPEG bitstream
The current tier 1 mjpeg firmware only supports decoding at the
bitstream level; the later tier 2 support will be at the buffer level
with newer hardware.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
ef099e6799 st/va: add slice parameter handling for MJPEG
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
8e9175744e st/va: add huffman table handling for MJPEG
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
93577e6081 st/va: add iq matrix handling for MJPEG
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
535b3c2363 st/va: add picture parameter handling for MJPEG
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
41f17eb5f0 st/va: add handles for MJPEG Buffers
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
38b9686df0 st/va: create decoder for MJPEG format
Mjpeg doesn't need reference

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
0a59477372 st/va: add MJPEG picture to context
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
15f3335577 radeon/video: add MJPEG support
v2: add ASIC and Kernel version check

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
3fe713ce3d radeon/uvd: add MJPEG support
There is no need for a dpb buffer with the mjpeg codec

v2: check dpb_size instead of format

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
b26cfdaebd radeon/uvd: add MJPEG stream type
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
4ac38ac3de vl: add MJPEG picture description
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
11ccb56e9f vl: add MJPEG profile and format
v2: move util video change to here

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Leo Liu
2b1eacabfa radeon/uvd: get the target buffer pitch correct for different format
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-08-21 10:09:09 -04:00
Samuel Pitoiset
2843c5d15c radeonsi: update non-resident bindless descriptors if needed
Only resident bindless descriptors are currently updated and
re-uploaded, this makes sure that the non-resident ones are
also updated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-21 15:23:56 +02:00
Louis-Francis Ratté-Boulianne
498814a3ca dri3: Move up fourcc utility function
It will be needed in next patches.

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-08-21 12:55:54 +01:00
Daniel Stone
85ef0215dd egl: Add dma_buf_import_modifiers for glvnd
Make sure we advertise the new entrypoints to libglvnd's EGL dispatch.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reported-by: Emmanuel Gil Peyrot <emmanuel.peyrot@collabora.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101982
Fixes: 4c412293d0 ("egl: advertise EGL_EXT_image_dma_buf_import_modifiers")
2017-08-21 12:13:50 +01:00
Topi Pohjolainen
393ec1a507 intel/blorp: Adjust intra-tile x when faking rgb with red-only
v2 (Jason): Adjust directly in surf_fake_rgb_with_red()

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101910

CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-08-21 09:55:08 +03:00
Dave Airlie
b040f51b61 ac/nir: fixup layer/viewport export for GFX9.
GFX9 moved where the viewport index export goes.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-21 04:26:37 +01:00
Jason Ekstrand
c366943ebf i965/bufmgr: s/BO_ALLOC_FOR_RENDER/BO_ALLOC_BUSY/
"Alloc for render" is a terrible name for a flag because it means
basically nothing.  What the flag really does is allocate a busy BO
which someone theorized at one point in time would be more efficient if
you're planning to immediately render to it.  If the flag really means
"alloc a busy BO" we should just call it that.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-20 20:14:49 -07:00
Jason Ekstrand
cadcd89278 i965/tex: Change the flags type on create_for_teximage
This matches the actual function declaration.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-20 20:14:49 -07:00
Christoph Haag
87556a650a mesa: only copy requested compressed teximage cubemap faces
This is analogous to commit 2259b11, which only fixed the regular case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102308
Signed-off-by: Christoph Haag <haagch+mesadev@frickel.club>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-08-20 17:01:48 -04:00
Jason Ekstrand
f24cf82d6d i965/tex: Don't pass samples to miptree_create_for_teximage
In 76e2f390f9, when Topi switched num_samples from 0 to 1 for
single-sampled, he accidentally switched the last parameter in the call
to miptree_create_for_teximage from 0 to 1 thinking it was num_samples
when it was actually layout_flags.  Switching from 0 to 1 added the
MIPTREE_LAYOUT_ACCELERATED_UPLOAD flag which causes us to allocate a
busy BO instead of an idle one.  This caused the subsequent CPU upload
to consistently stall.  The end result was a 15% performance drop in the
SynMark v7 DrvRes microbenchmark.  This restores the old behavior and
fixes the performance regression.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Fixes: 76e2f390f9
Bugzilla: https://bugs.freedesktop.org/102260
Cc: mesa-stable@lists.freedesktop.org
2017-08-19 15:39:12 -07:00
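The regression in f24cf82d6d comes down to an integer meant as a sample count landing in a bit-flag parameter. The following is a minimal sketch of that failure mode, not the i965 code: every `example_*` name is hypothetical, and the only detail taken from the commit message is that the value 1 coincides with the accelerated-upload flag, which selects a busy BO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the miptree layout flags.  Per the commit
 * message, the value 1 corresponds to the accelerated-upload flag. */
enum example_layout_flags {
   EXAMPLE_LAYOUT_ACCELERATED_UPLOAD = 1u << 0,
};

/* Hypothetical allocator decision: a busy BO is chosen when the
 * accelerated-upload bit is set, an idle one otherwise. */
static bool example_allocates_busy_bo(uint32_t layout_flags)
{
   return (layout_flags & EXAMPLE_LAYOUT_ACCELERATED_UPLOAD) != 0;
}

/* Shows how a caller that believes the last argument is a sample count
 * silently flips the allocation policy when it changes 0 to 1. */
static void example_usage(void)
{
   const uint32_t single_sampled_old = 0; /* old encoding of "one sample" */
   const uint32_t single_sampled_new = 1; /* new encoding of "one sample" */

   assert(!example_allocates_busy_bo(single_sampled_old));
   assert(example_allocates_busy_bo(single_sampled_new));
}
```

Because C treats both parameters as plain integers, the compiler cannot flag the mix-up; dropping the argument, as the commit title describes, removes the opportunity for it.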
Kenneth Graunke
6f8a577ed2 anv: Use ISL for emitting null surface states.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-19 00:46:48 -07:00
Kenneth Graunke
5ae983c85b i965: Use ISL for emitting null surface states.
We handle the Sandybridge multisampled 2D surface hack here, rather
than in ISL, because it requires allocating a BO, and is kind of messy.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-19 00:46:46 -07:00
Kenneth Graunke
5db9757bd7 isl: Add a null surface fill function.
ISL already offers functions to fill out most kinds of SURFACE_STATE,
so why not handle null surfaces too?

Null surfaces are simple, so we can just take the dimensions, rather
than an entire fill structure.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-19 00:46:36 -07:00
Kenneth Graunke
288621b1b7 i965: Remove tabs in intel_batchbuffer.c.
Our coding style is to use spaces.  Some of this was also messed up
during my bufmgr import series.

(Trivial, just whitespace changes.)
2017-08-18 23:51:56 -07:00
Jason Ekstrand
61d2f3f1c2 i965/miptree: Return NONE from texture_aux_usage when fully resolved
This little optimization improves the performance of SynMark v7
TexFilterTri by almost 10% on Sky Lake GT4 among other improvements.
We've been doing it for some time but somehow it got dropped during
the miptree refactoring.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/102258
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
2017-08-18 17:31:02 -07:00
Jason Ekstrand
d5e217dbfd i965: Stop looking at NewDriverState when emitting 3DSTATE_URB
Looking at NewDriverState is not safe in general.  The state atom system
is set up to ensure that new bits that get added to NewDriverState get
accumulated into the set of bits used when emitting atoms but it doesn't
go the other way.  If we read NewDriverState, we may not get the full
picture because the per-pipeline state (3D or compute) does not get
added to NewDriverState before state emit is done.  It's especially
dangerous to do this from BLORP (either explicitly or implicitly when
BLORP calls gen7_upload_urb) because that does not happen during one of
the normal state upload paths.

This commit solves the problem by whacking all of the per-shader-stage
URB sizes to zero whenever we change the total URB size.  We still have
to flag BRW_NEW_URB_SIZE to ensure that the gen7_urb atom triggers but
the actual decision in gen7_upload_urb can now be based entirely on URB
sizes rather than on state atoms.  This also makes BLORP correct because
it just asks for a new URB config whenever the vsize is too small and so
any change to the total URB size will trigger blorp to re-emit as well
because 0 < vs_entry_size.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Bugzilla: https://bugs.freedesktop.org/102289
Cc: mesa-stable@lists.freedesktop.org
2017-08-18 17:30:55 -07:00
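The approach described in d5e217dbfd (zero the per-stage sizes whenever the total URB size changes, then decide purely on sizes) can be sketched as follows. This is an illustration under assumptions, not the driver code: `struct example_urb` and both functions are hypothetical, and only the vertex-stage size is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified URB bookkeeping. */
struct example_urb {
   unsigned total_size;    /* total URB size */
   unsigned vs_entry_size; /* currently programmed VS entry size */
   unsigned emit_count;    /* how many times a config was (re-)emitted */
};

/* Changing the total size invalidates the per-stage size by zeroing it. */
static void example_set_total_size(struct example_urb *urb, unsigned total_size)
{
   assert(urb != NULL);
   if (urb->total_size == total_size)
      return;
   urb->total_size = total_size;
   urb->vs_entry_size = 0;
}

/* Re-emit only when the programmed size is too small.  Since any needed
 * size is greater than 0, a zeroed size always triggers a re-emit, with
 * no dirty-bit inspection required. */
static bool example_ensure_vs_entry_size(struct example_urb *urb, unsigned needed)
{
   assert(urb != NULL && needed > 0);
   if (urb->vs_entry_size >= needed)
      return false;
   urb->vs_entry_size = needed;
   urb->emit_count++;
   return true;
}
```

A caller outside the normal state-upload path (BLORP, in the commit's terms) only needs to call the second function; it never has to know which state bits are dirty.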
Kenneth Graunke
bc56dfbf3f i965: Mark all EGLimages as non-coherent.
EGLimages are shared with external users, and we don't know what they're
going to do with them.  They might scan them out.  They might access
them in a way that doesn't work with our explicit clflushing.

It's safest to simply mark them non-coherent.

Chris Wilson caught this problem and wrote a similar (though less
aggressive) patch to solve it; the miptree code has since undergone
a lot of refactoring so I had to rewrite it.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-08-18 16:28:13 -07:00
Eric Anholt
a727e03360 broadcom/genxml: Add V3D 3.3 packet definitions.
This will be used by the new vc5 gallium driver, and a future Vulkan
driver.
2017-08-18 12:54:13 -07:00
Eric Anholt
7c576d6091 broadcom/genxml: Check the sub-id field when decoding instructions.
VC5 introduces packet variants where the same opcode has behavior that is
decided by a sub-id field in the early bits of the packet.  Keep iterating
over packets until we find the one with the matching sub-id.
2017-08-18 11:56:58 -07:00
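The lookup described in 7c576d6091 (keep iterating until both the opcode and the sub-id match) might look roughly like the sketch below. The table layout, the `EXAMPLE_NO_SUB_ID` sentinel, and the opcode values are all invented for illustration; the real decoder works from the XML packet definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel for packets that have no sub-id field. */
#define EXAMPLE_NO_SUB_ID (-1)

/* Hypothetical packet definition: an opcode plus an optional sub-id. */
struct example_packet_def {
   uint8_t opcode;
   int sub_id;
};

/* Invented example table: opcode 0x20 has two variants. */
static const struct example_packet_def example_defs[] = {
   { 0x10, EXAMPLE_NO_SUB_ID },
   { 0x20, 0 },
   { 0x20, 1 },
};

/* Returns the index of the first definition whose opcode matches and whose
 * sub-id, if it has one, also matches.  Returns -1 when nothing matches. */
static int example_find_packet(const struct example_packet_def *defs,
                               size_t count, uint8_t opcode, uint8_t sub_id)
{
   assert(defs != NULL);
   for (size_t i = 0; i < count; i++) {
      if (defs[i].opcode != opcode)
         continue;
      if (defs[i].sub_id != EXAMPLE_NO_SUB_ID && defs[i].sub_id != sub_id)
         continue; /* same opcode, different variant: keep iterating */
      return (int)i;
   }
   return -1;
}
```

Stopping at the first opcode match, as a decoder without sub-id support would, always returns the first `0x20` entry regardless of which variant the packet actually encodes.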
Eric Anholt
14fe9fd3f7 broadcom/genxml: Emit code for default headers for structs as well.
In the vc5 NIR backend, I want to use the XML code-generation to set up
pack/unpack of structs for the texture uniforms, and setting up the
unpacked copy needs a default header.
2017-08-18 11:56:58 -07:00
Eric Anholt
9caba0f16f anv: Move a comment that got left behind in the u_vector refactor.
2017-08-18 11:56:58 -07:00
Marek Olšák
57fb1bb585 gallium/radeon: remove old_fence parameter from r600_gfx_write_event_eop
just use the new scratch buffer.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-18 16:06:21 +02:00
Marek Olšák
41e053954d radeonsi/gfx9: prevent a GPU hang after a timestamp event
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-18 16:06:18 +02:00
Marek Olšák
13aa8d3da9 radeonsi: don't use CLEAR_STATE on SI
This fixes random hangs with Unigine Valley.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102201

Fixes: 064550238e ("radeonsi: use CLEAR_STATE to initialize some registers")
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-18 15:59:22 +02:00
Jon Turney
5ee159e4b3 Fix build when HAVE_LIBDRM isn't defined
make[4]: Entering directory '/wip/mesa/build/src/gallium/targets/dri'
  CXXLD    gallium_dri.la
../../../../src/gallium/auxiliary/pipe-loader/.libs/libpipe_loader_static.a(libpipe_loader_static_la-pipe_loader.o): In function `pipe_loader_get_driinfo_xml':
/mesa/build/src/gallium/auxiliary/pipe-loader/../../../../../src/gallium/auxiliary/pipe-loader/pipe_loader.c:117: undefined reference to `pipe_loader_drm_get_driinfo_xml'

b4ff5e90 uses pipe_loader_drm_get_driinfo_xml() unconditionally in
pipe_loader.c, but its definition is only built if HAVE_LIBDRM.

Arrange to always use the default XML if HAVE_LIBDRM isn't defined.

Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-18 15:08:00 +02:00
Kenneth Graunke
5af7f1ccec i965: Fix missing newlines in perf_debug messages.
perf_debug() doesn't append a newline for you.
2017-08-17 23:42:49 -07:00
Ilia Mirkin
9c8f017f77 glsl: add a few missing int64 constant propagation cases
Fixes KHR-GL45.shader_ballot_tests.ShaderBallotAvailability, which
causes some silly swizzles to appear, triggering this optimization to
get hit.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
2017-08-18 02:26:16 -04:00
Timothy Arceri
c03eefdf84 glsl: set old ldexp operand to NULL when lowering
This fixes an assert during IR validation in LLVMpipe.

Fixes: e2e2c5abd2 (glsl: calculate number of operands in an expression once)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102274
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2017-08-18 12:07:34 +10:00
Jason Ekstrand
1af8342b0c intel/isl: Replace switch statements of doom with a macro
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-17 18:09:05 -07:00
Jason Ekstrand
2d68d27071 intel/isl: Reduce header file duplication
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-17 18:09:05 -07:00
Dave Airlie
611076a41a radv: disable support for VEGA for now.
I'm working on this, but I'm not sure I'll make 17.2 at this stage,
maybe 17.2.1.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-18 00:49:48 +01:00
Jeremy Huddleston Sequoia
c1c4c18a80 glxcmds: Fix a typo in the __APPLE__ codepath
s/DummyContext/dummyContext/

Regressed-in: 5d9b50e596
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2017-08-17 15:13:33 -07:00
Roland Scheidegger
3e96231457 llvmpipe: enable PIPE_CAP_QUERY_SO_OVERFLOW
The driver supported this since way before the GL spec for it existed.
Just need to support both the per-stream and for all streams variants
(which are identical due to only supporting 1 stream).
Passes piglit arb_transform_feedback_overflow_query-basic.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-08-17 18:46:44 +02:00
Roland Scheidegger
26d46b94b4 softpipe: enable PIPE_CAP_QUERY_SO_OVERFLOW
The driver was supposed to support this since way before the GL spec for it
existed, albeit it was apparently broken, so fix and enable it.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-08-17 18:46:44 +02:00
Gwan-gyeong Mun
c87594575b dri: fix typo in comment
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-17 09:58:49 +01:00
Vinson Lee
a0ed82947c configure.ac: Check for expat21 if expat is not found.
Fixes build error on CentOS 6.9.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102052
Fixes: 5c007203b7 ("configure.ac: drop manual detection of expat header/library")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2017-08-17 00:11:27 -07:00
Michel Dänzer
2c5717a4de configure: Check llvm-config --shared-mode
https://bugs.llvm.org/show_bug.cgi?id=6823 still affects current LLVM.
llvm-config --libs only reports the single shared library if LLVM was
built with -DLLVM_LINK_LLVM_DYLIB=ON. llvm-config --shared-mode reports
"shared" in that case, "static" otherwise (even if LLVM was built with
-DLLVM_BUILD_LLVM_DYLIB=ON).

v2: Keep the LLVM < 4.0 test. (llvm-config --shared-mode is actually
    available since LLVM 3.8, but that would make the test too
    complicated :)

Fixes: 3d8da1f678 ("configure: Trust LLVM >= 4.0 llvm-config --libs
                      for shared libraries")
Bugzilla: https://bugs.freedesktop.org/102247
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-08-17 15:13:09 +09:00
Thomas Hellstrom
0cc4c7e33e loader_dri3: Make sure we have an updated back v3
With GLX_SWAP_COPY_OML and GLX_SWAP_EXCHANGE_OML it may happen in situations
when glXSwapBuffers() is immediately followed by for example another
glXSwapBuffers() or glXCopyBuffers() or back buffer age querying, that we
haven't yet allocated and initialized a new back buffer because there was
no GL rendering in between.

Make sure that we have a back buffer in those situations.

v2: Eliminate the drawable have_back_format member.
v3: Make sure we re-initialize the back even if it exists.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-08-17 07:39:42 +02:00
Thomas Hellstrom
7c3e3c0faf loader_dri3: Support GLX_SWAP_EXCHANGE_OML
Add support for the exchange swap method. Since we're now forcing a fake front
buffer and we exchange the back and fake front on swaps, we don't need to add
much code.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-08-17 07:39:42 +02:00
Thomas Hellstrom
c898e02a33 loader_dri3: Eliminate the back-to-fake-front copy
Eliminate the back-to-fake-front copy by exchanging the previous back buffer
and the fake front buffer. This is a gain except when we need to preserve
the back buffer content but in that case we still typically gain by replacing
a server-side blit by a client side non-flushing blit.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-08-17 07:39:42 +02:00
Thomas Hellstrom
74b4cdd80a loader_dri3: Remove buffer_type from buffer metadata
It's not used anywhere and now that we're about to exchange back- and
fake fronts it doesn't serve a purpose.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-08-17 07:39:42 +02:00
Thomas Hellstrom
16d1a0bcdb loader_dri3: Support GLX_SWAP_COPY_OML
Support the GLX_SWAP_COPY_OML method. When this method is requested, we use
the same swapbuffer code path as EGL_BUFFER_PRESERVED.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-08-17 07:39:42 +02:00
Thomas Hellstrom
1e83baeb4b loader_dri3: Honor the request to preserve back buffer content
EGL uses the force_copy parameter to loader_dri3_swap_buffers_msc() to indicate
that it wants to preserve back buffer contents across a buffer swap.

While the loader then turns off server-side page-flipping there's nothing to
guarantee that a new backbuffer isn't chosen when EGL starts to render again,
and that buffer's content is of course undefined.

So rework the functionality:
If the client supports local blits, allow server-side page flipping and when
a new back is grabbed, if needed, blit the old back's content to the new back.
If the client doesn't support local blits, disallow server-side page-flipping
to avoid a client deadlock and then, when grabbing a new back buffer, sleep
until the old back is idle, which may take a substantial time depending on
swap interval.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-08-17 07:39:42 +02:00
Thomas Hellstrom
f71e174bb8 loader_dri3: Increase the likelihood of reusing the current swap buffer
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-08-17 07:39:42 +02:00
Thomas Hellstrom
2db9548296 loader_dri3/glx/egl: Optionally use a blit context for blitting operations
The code was relying on us always having a current context for client local
image blit operations. Otherwise the blit would be skipped. However,
glXSwapBuffers, for example, doesn't require a current context and that was a
common problem in the dri1 era. It seems the problem has resurfaced with dri3.

If we don't have a current context when we want to blit, try creating a private
dri context and maintain a context cache of a single context.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-08-17 07:39:42 +02:00
Thomas Hellstrom
5198e48a0d loader_dri3/glx/egl: Remove the loader_dri3_vtable get_dri_screen callback
It's not very usable since in the rare, but definitely existing case that
we don't have a current context, it will return NULL.

Presumably it will always be safe to use the dri screen the drawable was
created with for operations on that drawable.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-08-17 07:39:42 +02:00
Ilia Mirkin
934511d1f3 nv50/ir: fix TXQ srcMask
src0.x is always read for the LOD, irrespective of which outputs are
read.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-08-16 22:39:22 -04:00
Ilia Mirkin
054c54d1be nv50/ir: fix srcMask computation for TG4 and TXF
This affects which inputs are marked as used. In a situation where only
the texture instruction uses an input, it might have been ignored as
unused due to input masks.

Affects subtests of KHR-GL45.texture_cube_map_array.sampling

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-08-16 22:39:21 -04:00
Jason Ekstrand
bf1d2e84f3 anv/gem: Add a stub for sync_file_merge
This fixes make check

Fixes: 5c4e4932e0
2017-08-16 18:44:26 -07:00
Dave Airlie
4c02e2bd95 radv: disable texture gather workaround on gfx9.
Not required anymore.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-17 02:24:36 +01:00
Brian Paul
3ab0c25939 st/mesa: remove Windows hack for glFinish
I see no evidence that opengl32.dll's wglSwapBuffers calls glFinish.
It looks like Jose removed that dependency years ago, but this hack
remained.

Removing this code also fixes the Piglit sync_api test since commit
eceb671002.

No piglit regressions.  No glretrace regressions, per Charmaine.
Fixes VMware bug 1937990.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-08-16 19:03:10 -06:00
Frank Richter
7fb7287ce7 gallium/os: fix os_time_get_nano() to roll over less
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102241
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-08-16 18:32:47 -06:00
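The commit message for 7fb7287ce7 does not say how the rollover is reduced. One common technique for tick-to-nanosecond conversion, shown here as an assumption rather than as the actual patch, is to split the tick count into whole seconds and a remainder so that the large multiplication is applied only to the remainder. Function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_NS_PER_SEC UINT64_C(1000000000)

/* Naive conversion: ticks * 1e9 wraps around once ticks exceeds
 * UINT64_MAX / 1e9, which is roughly 15 minutes of uptime at 10 MHz. */
static uint64_t example_ticks_to_ns_naive(uint64_t ticks, uint64_t freq)
{
   assert(freq != 0);
   return ticks * EXAMPLE_NS_PER_SEC / freq;
}

/* Split conversion: whole seconds are scaled separately, and only the
 * sub-second remainder (always smaller than freq) is multiplied by 1e9.
 * The remainder product cannot wrap while freq stays below about 1.8e10. */
static uint64_t example_ticks_to_ns_split(uint64_t ticks, uint64_t freq)
{
   assert(freq != 0);
   uint64_t secs = ticks / freq;
   uint64_t rem = ticks % freq;
   return secs * EXAMPLE_NS_PER_SEC + rem * EXAMPLE_NS_PER_SEC / freq;
}
```

With a 10 MHz counter at 20000 seconds plus 5 ticks, the split version yields 20000000000500 ns, while the naive product has already wrapped and reports a much smaller value. A wrapped timestamp of this kind is one way a later "end minus start" delta can come out negative.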
Frank Richter
d90e05ad48 st/wgl: check for negative delta in wait_swap_interval()
This can happen because of rollover.  See bug report for details.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102241
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-08-16 18:32:46 -06:00
Frank Richter
496a691e35 st/mesa: fix a null pointer access
Fixes crash with llvmpipe on Windows.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102148
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-08-16 18:32:41 -06:00
Kenneth Graunke
27fb0899f7 i965: Alphabetize TCS image dirty bits
Trivial.
2017-08-16 16:09:29 -07:00
Chris Wilson
49eda75df6 i965: Always allow CPU readback of the scanout on LLC platforms
LLC platforms are magic in that reads from the CPU are always cache
coherent, or rather GPU writes that bypass LLC do still invalidate the
appropriate cache line.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-16 12:25:02 -07:00
Tim Rowley
b333bc753e swr/rast: Fix invalid casting for calls to Interlocked* functions
CID: 1416243, 1416244, 1416255
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-16 14:20:22 -05:00
Boyuan Zhang
a44b334e48 radeon/vce: support all firmwares with major ver 53
The vce firmware interface should now be stable, so all firmwares with
a major version equal to 53 are supported.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig at amd.com>
2017-08-16 14:42:41 -04:00
Tapani Pälli
733422e53c i965: make sure check_and_emit_atom gets inlined
Improves performance of 3DMark "Ice Storm Unlimited" benchmark
by 1-2% on Apollolake (on Android-IA using clang 3.8.256229).

Change is based on the performance profiling work and results
by Aravindan Muthukumar and Yogesh Marathe.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
Reviewed-by: Scott D Phillips <scott.d.phillips@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-08-16 12:32:32 +03:00
Ilia Mirkin
f96f210239 a2xx: only update rasterizer settings when they're there
The rasterizer being empty can happen e.g. during clears

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-08-15 22:54:40 -04:00
Ilia Mirkin
08f72a8944 a2xx: add logicop support
This passes both gl-1.0-logicop and gl-1.1-xor piglits.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-08-15 22:54:40 -04:00
Ilia Mirkin
978c4c597a glsl/ast: update rhs in addition to the var's constant_value
We continue in the code to do some more things with the rhs, including
setting a constant initializer. If the type is wrong, this causes some
confusion down the line, leading to assertions. This makes sure that the
rhs processing continues to flow as-if the type was correct to start
with (even though the state has been marked as an error state).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101766
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2017-08-15 22:14:05 -04:00
Jason Ekstrand
98983503cb anv: Advertise VK_KHR_external_semaphore
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Jason Ekstrand
55bce22d8d anv: Use DRM sync objects for external semaphores when available
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Jason Ekstrand
f41a0e4b0d anv/gem: Add drm syncobj support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Jason Ekstrand
eb4564bf93 intel/drm: Pull in the i915 fence array API
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Jason Ekstrand
5c4e4932e0 anv: Implement support for exporting semaphores as FENCE_FD
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Jason Ekstrand
e4054ab77b anv/gem: Use EXECBUFFER2_WR when the FENCE_OUT flag is set
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Jason Ekstrand
017cdb10cf anv: Submit a dummy batch when only semaphores are provided.
Vulkan allows you to do a submit whose only job is to wait on and
trigger semaphores.  The easiest way for us to support that right
now is to insert a dummy execbuf.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Jason Ekstrand
031f57eba3 anv: Add a basic implementation of VK_KHX_external_semaphore
This patch adds an implementation based on DRM BOs.  We don't actually
advertise the extension yet because we want to add a couple more paths
first.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-15 19:08:26 -07:00
Aaron Watry
a8296dbd5a clover/event: Include additional event statuses for clSetEventCallback
From CL 2.0 Section 5.11 (Event Objects):
  clSetEventCallback returns CL_SUCCESS if the function is executed successfully. Otherwise, it
  returns one of the following errors:
    ...
    CL_INVALID_VALUE if pfn_event_notify is NULL or if command_exec_callback_type is
    not CL_SUBMITTED , CL_RUNNING or CL_COMPLETE .

Fixes: OpenCL CTS test_conformance/events/test_events callbacks

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-08-15 19:55:15 -05:00
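The rule quoted in a8296dbd5a can be written as a small argument check. The sketch below is plain C with hypothetical names, not clover's C++ implementation; the `EX_CL_*` constants are local copies of the corresponding values in `CL/cl.h`, so the block compiles without the OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>

/* Local copies of the relevant CL/cl.h values. */
enum {
   EX_CL_SUCCESS = 0,
   EX_CL_INVALID_VALUE = -30,
   EX_CL_COMPLETE = 0x0,
   EX_CL_RUNNING = 0x1,
   EX_CL_SUBMITTED = 0x2,
   EX_CL_QUEUED = 0x3,
};

/* Hypothetical, simplified callback signature. */
typedef void (*example_notify_fn)(int status, void *user_data);

static void example_noop_notify(int status, void *user_data)
{
   (void)status;
   (void)user_data;
}

/* Accepts CL_SUBMITTED, CL_RUNNING and CL_COMPLETE with a non-NULL
 * callback; everything else is CL_INVALID_VALUE. */
static int example_validate_callback_args(int callback_type,
                                          example_notify_fn pfn_event_notify)
{
   if (pfn_event_notify == NULL)
      return EX_CL_INVALID_VALUE;

   switch (callback_type) {
   case EX_CL_SUBMITTED:
   case EX_CL_RUNNING:
   case EX_CL_COMPLETE:
      return EX_CL_SUCCESS;
   default:
      return EX_CL_INVALID_VALUE;
   }
}

static void example_usage(void)
{
   assert(example_validate_callback_args(EX_CL_RUNNING, example_noop_notify)
          == EX_CL_SUCCESS);
   assert(example_validate_callback_args(EX_CL_QUEUED, example_noop_notify)
          == EX_CL_INVALID_VALUE);
}
```

Note that `CL_QUEUED` is a valid event status but is not in the accepted list, so it is rejected here.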
Jonas Pfeil
494f86bbe5 broadcom/vc4: Port NEON-code to ARM64
Changed all register and instruction names, works the same.

v2: Rebase on build system changes (by anholt)
v3: Fix build on clang (by anholt, reported by Rob)

Signed-off-by: Jonas Pfeil <pfeiljonas@gmx.de>
Tested-by: Rob Herring <robh@kernel.org>
2017-08-15 13:23:54 -07:00
Eric Anholt
bd5efbd70b broadcom/vc4: Build the vc4_tiling_lt_neon.c with -mfpu=neon on ARM.
If you don't pass this, the compiler refuses to compile the assembly for
pre-v7 CPUs.  This also keeps us from building identical, non-NEON code on
aarch64 and x86.

Fixes: a373f77662 ("vc4: Use a wrapper file to set VC4_BUILD_NEON instead of CFLAGS.")

v2: Fix Android build by just appending NEON_C_SOURCES when
    ARCH_ARM_HAVE_NEON.

Tested-by: Rob Herring <robh@kernel.org>
2017-08-15 13:23:54 -07:00
Eric Anholt
ba8533b6ea configure.ac: Introduce HAVE_ARM_ASM/HAVE_AARCH64_ASM and the -D flags.
I've been trying to get away without these conditionals in vc4's NEON
code, but it meant compiling extra unused code on x86, and build failing
on ARMv6.

v2: Use the _arm/_arm64 flags to simplify detection (suggested by Rob),
    but hide the _arm version under ARCH_ARM_HAVE_NEON to keep from trying
    to build this stuff for armv5te.

Tested-by: Rob Herring <robh@kernel.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-08-15 13:23:54 -07:00
Eric Anholt
b94ddc181b util: Fix build on old glibc.
We need to link librt for u_thread.h's clock_gettime() call.

Fixes: b822d9dd67 ("gallium/util: move u_queue.{c,h} to src/util")
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-08-15 13:23:54 -07:00
Eric Anholt
f785db3d31 broadcom: Add v3d_xml.h to gitignore.
2017-08-15 13:23:54 -07:00
Eric Anholt
463de32b95 broadcom: Add missing libexpat cflags for the decoder.
The Raspbian ARMv6 cross compiler wasn't picking up my (amd64) system copy
of the header the way that the system gcc and armhf cross-compile did.
2017-08-15 13:23:54 -07:00
Dave Airlie
694d59fbaf radv/gfx9: for fast clear use is_linear flag.
The legacy test won't work on gfx9.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-16 06:27:30 +10:00
David Airlie
31bb8517a1 radv/gfx9: fix tile swizzle handling for gfx9
This sets the tile swizzle up properly for gfx9.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-16 05:54:19 +10:00
David Airlie
e43cc3e3af radv/gfx9: handle GFX9 opaque metadata
port the opaque metadata changes from radeonsi for gfx9.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-16 05:54:15 +10:00
David Airlie
674ecbfef2 radv: emit db_htile_surface reg on gfx9 as well
This is also a GFX9 register.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-16 05:54:09 +10:00
Dave Airlie
fc600eb98d radv/gfx9: remove some leftover gfx6 descriptor setup.
We set this later in the non-gfx9 path, just remove these
bits from here.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-16 05:54:03 +10:00
Dave Airlie
5247b311e9 radv/gfx9: fix set predication packet.
The predication packet changed format on GFX9, update the driver.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-16 05:52:50 +10:00
Scott D Phillips
d6539608a4 intel/genxml: Fix gen10 BLEND_STATE variable length packing
BLEND_STATE packing was modified to be variable-length in:

 9670124e31 genxml: Make BLEND_STATE command support variable length array.

The initial gen10.xml still had the old, fixed-length style
definition for BLEND_STATE. So gen10_upload_blend_state would
overwrite the packed BLEND_STATE_ENTRYs with its own fixed array
of all-zero entries when packing BLEND_STATE. This caused
BLEND_STATE upload to not work at all.

Fixes: aa416f515a ("i965/genxml: Add gen10.xml")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-08-15 09:06:29 -07:00
Timothy Arceri
fe74c8ffbf mesa: count uniform against storage when it's bindless
Gallium drivers use this code path so we need to account for
bindless after all.

Fixes: 365d34540f ("mesa: correctly calculate the storage offset for i915")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-15 23:51:35 +10:00
Marek Olšák
1ab7fed707 radeonsi: disable CE by default
It makes performance worse by a very small (hard to measure) amount.
We've done extensive profiling of this feature internally.

Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Christian König <christian.koenig@amd.com>
2017-08-15 15:03:43 +02:00
Dave Airlie
e0edfadec8 radeonsi: initialise imported surface to 0.
For memobj imports we weren't setting the surface to 0, which
meant sometimes we'd end up with tile_swizzle garbage, which
would corrupt rendering.

This seems to fix the image corruption on the imported memory
objects in vrdashboard for me.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-15 01:35:58 +01:00
Timothy Arceri
de0e62e106 st/mesa: correctly calculate the storage offset
When generating the storage offset for struct members we need
to skip opaque types as they no longer have backing storage.

Fixes: fcbb93e860 ("mesa: stop assigning unused storage for non-bindless opaque types")

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101983
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-15 08:20:57 +10:00
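The offset rule in de0e62e106 (skip opaque members because they no longer have backing storage) can be illustrated with a toy struct layout. This is a sketch under assumptions: the member table, the "slots" unit, and the function name are all hypothetical and do not reflect the real uniform storage code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one struct member. */
struct example_member {
   unsigned slots;  /* storage slots the member would occupy */
   bool is_opaque;  /* e.g. a sampler: no backing storage */
};

/* Invented example struct: { 2-slot value; sampler; 1-slot value }. */
static const struct example_member example_struct[] = {
   { 2, false },
   { 1, true },
   { 1, false },
};

/* Sums the slots of all non-opaque members that precede `index`. */
static unsigned example_storage_offset(const struct example_member *members,
                                       size_t count, size_t index)
{
   assert(members != NULL && index < count);
   unsigned offset = 0;
   for (size_t i = 0; i < index; i++) {
      if (!members[i].is_opaque)
         offset += members[i].slots;
   }
   return offset;
}
```

For the last member the offset is 2; counting the opaque member as well would give 3 and point past the storage that was actually assigned.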
Timothy Arceri
365d34540f mesa: correctly calculate the storage offset for i915
When generating the storage offset for struct members we need
to skip opaque types as they no longer have backing storage.

Fixes: fcbb93e860 ("mesa: stop assigning unused storage for non-bindless opaque types")

V2: simplify since bindless will never be supported in this code

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101983
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-15 08:20:57 +10:00
Ben Widawsky
1efd73df39 i965: Advertise the CCS modifier
v2: Rename modifier to be more smart (Jason)

FINISHME: Use the kernel's final choice for the fb modifier

bwidawsk@norris2:~/intel-gfx/kmscube (modifiers $) ~/scripts/measure_bandwidth.sh ./kmscube none
Read bandwidth: 603.91 MiB/s
Write bandwidth: 615.28 MiB/s
bwidawsk@norris2:~/intel-gfx/kmscube (modifiers $) ~/scripts/measure_bandwidth.sh ./kmscube ytile
Read bandwidth: 571.13 MiB/s
Write bandwidth: 555.51 MiB/s
bwidawsk@norris2:~/intel-gfx/kmscube (modifiers $) ~/scripts/measure_bandwidth.sh ./kmscube ccs
Read bandwidth: 259.34 MiB/s
Write bandwidth: 337.83 MiB/s

v2: Move all references to the new fourcc code(s) to this patch.
v3: Rebase, remove Yf_CCS (Daniel)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-08-14 10:43:30 -07:00
Jason Ekstrand
51600b8489 i965/miptree: More conservatively resolve external images
Instead of always doing a full resolve, only resolve the bits that are
needed.  This means that we only do a partial resolve when the miptree
modifier is I915_FORMAT_MOD_Y_TILED_CCS.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-08-14 10:43:30 -07:00
Ben Widawsky
8f6e54c929 i965: Pretend that CCS modified images are two planes
v2: move is_aux into if block. (Jason)
Use else block instead of goto (Jason)

v3: Fix up logic for is_aux (Ben)
Fix up size calculations and add FIXME (Ben)

v4 (Jason Ekstrand):
Use the aux_pitch in the image instead of calculating it

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-08-14 10:43:30 -07:00
Jason Ekstrand
a1e5db9888 i965/screen: Support import and export of surfaces with CCS
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-08-14 10:43:30 -07:00
Ben Widawsky
a068fdc861 i965/miptree: Allocate mcs_buf for an image's CCS
This code will disable actually creating these buffers for the scanout,
but it puts the allocation in place.

Primarily this patch is split out for review, it can be squashed in
later if preferred.

v2:
assert(mt->offset == 0) in ccs creation (as requested by Topi)
Remove bogus is_scanout check in miptree_release

v3:
Remove is_scanout assert in intel_miptree_create. It doesn't work with
latest codebase - not sure it ever should have worked.

v4:
assert(mt->last_level == 0) and assert(mt->first_level == 0) in ccs setup
(Topi)

v5 (Jason Ekstrand):
 - Base the decision to allocate a CCS on the image modifier

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-08-14 10:43:30 -07:00
Ben Widawsky
f6fbeaf1c4 i965: Support images with aux buffers
Previously images did not support any auxiliary compression surfaces
(CCS, MCS, or HiZ).  That's about to change.  This patch just adds the
fields to __DRIimageRec to make auxiliary surfaces possible.

v2 (Jason Ekstrand):
 - Add an aux_pitch parameter as well as aux_offset

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-08-14 10:43:30 -07:00
Jason Ekstrand
cf2e92262b intel/isl: Add support for I915_FORMAT_MOD_Y_TILED_CCS
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-08-14 10:43:30 -07:00
Jason Ekstrand
51eb40d414 i965/screen: Stop redefining DRM_FORMAT_MOD_(INVALID|LINEAR)
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2017-08-14 10:43:30 -07:00
Jason Ekstrand
c0e9f80cd6 drm-uapi/fourcc: Pull in new modifiers
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2017-08-14 10:43:30 -07:00
Scott D Phillips
f7dfc44c61 i965/blorp: Correct type of src_format in call to intel_miptree_texture_aux_usage
intel_miptree_texture_aux_usage() takes an isl_format, but we are
passing a mesa_format. clang warns:

 brw_blorp.c:305:52: warning: implicit conversion from enumeration
    type 'mesa_format' to different enumeration type
    'enum isl_format' [-Wenum-conversion]
       intel_miptree_texture_aux_usage(brw, src_mt, src_format);
       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~              ^~~~~~~~~~

Fixes: fc1639e46d ("i965/blorp: Use texture/render_aux_usage for blits")
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-14 10:41:54 -07:00
Julien Isorce
91d93aa621 st/va: change frame_idx from array to hash table
The picture_id was assumed to be a frame number so in 0-31.
But the vaapi client gstreamer-vaapi uses the surfaces handles
as identifier which are unsigned int.

This bug can happen when using a lot of vaapi surfaces within
the same process. Indeed Mesa/st/va increments a counter for the
surface ID: mesa/util/u_handle_table.c::handle_table_add which
starts from 0 and is incremented by 1 at each call.
So creating more than 32 surfaces was a problem.

The following bug contains a test that reproduces the problem
by running a couple of vaapih264enc in the same process. The
above also explains why there was no problem when running them in
separate processes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102006
Signed-off-by: Julien Isorce <jisorce@oblong.com>
Tested-by: Tomas Rataj <rataj28@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-and-tested-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
2017-08-14 13:40:19 +01:00
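A minimal sketch of the idea behind 91d93aa621, assuming a tiny open-addressing table in place of Mesa's real hash table utility. The names frame_map, frame_map_put and frame_map_get are hypothetical and not taken from the patch. It only illustrates that an arbitrary unsigned surface ID can serve as a lookup key, whereas using it directly as an index into a 32-entry array cannot work once IDs exceed 31.

```c
#include <stdbool.h>
#include <stdint.h>

#define FRAME_MAP_SLOTS 64u  /* power of two; hypothetical capacity */

struct frame_map {
   bool     used[FRAME_MAP_SLOTS];
   uint32_t key[FRAME_MAP_SLOTS];   /* surface ID as supplied by the client */
   uint32_t value[FRAME_MAP_SLOTS]; /* frame index stored for that surface */
};

/* Find the slot holding `id`, or the first free slot on its probe path.
 * Returns FRAME_MAP_SLOTS when the table is full and `id` is absent. */
static uint32_t frame_map_slot(const struct frame_map *m, uint32_t id)
{
   uint32_t start = (id * 2654435761u) & (FRAME_MAP_SLOTS - 1);
   for (uint32_t i = 0; i < FRAME_MAP_SLOTS; i++) {
      uint32_t s = (start + i) & (FRAME_MAP_SLOTS - 1);
      if (!m->used[s] || m->key[s] == id)
         return s;
   }
   return FRAME_MAP_SLOTS;
}

/* Insert or overwrite the frame index for a surface ID. */
static bool frame_map_put(struct frame_map *m, uint32_t id, uint32_t frame_idx)
{
   uint32_t s = frame_map_slot(m, id);
   if (s == FRAME_MAP_SLOTS)
      return false;
   m->used[s] = true;
   m->key[s] = id;
   m->value[s] = frame_idx;
   return true;
}

/* Look up a surface ID; returns false when it was never inserted. */
static bool frame_map_get(const struct frame_map *m, uint32_t id,
                          uint32_t *frame_idx)
{
   uint32_t s = frame_map_slot(m, id);
   if (s == FRAME_MAP_SLOTS || !m->used[s])
      return false;
   *frame_idx = m->value[s];
   return true;
}
```

Entries are never deleted in this sketch, which is what keeps plain linear probing correct here.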
Michel Dänzer
3d8da1f678 configure: Trust LLVM >= 4.0 llvm-config --libs for shared libraries
No need to manually look for the library files anymore with current
LLVM. This sidesteps the manual method failing when LLVM was built with
-DLLVM_APPEND_VC_REV=ON.

(This might already work with older versions of LLVM)

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-14 13:37:54 +09:00
Ilia Mirkin
165e18dd21 nv50/ir: clean up saturated values immediately
Since we don't iterate to a fixed point, we can end up in situations
where we have a SAT instruction + a long immediate. This is not legal.
However since it's immediately computable, just run unary straight away
to handle the situation.

Fixes: 24a799ad35 ("nv50/ir: fix ConstantFolding with saturation")
Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-08-12 14:49:08 -04:00
Ilia Mirkin
ea22ac23e0 nvc0/ir: unlink values pre- and post-call to division function
While technically correct, this can lead to e.g. getImmediate assuming
that it can walk up the value chain. It could be fixed to not do this,
but it seems easier and less error-prone to just not link the two values
to save on one LValue object.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-08-12 14:49:08 -04:00
Kenneth Graunke
22e1d8832c i965: Guard GetBufferSubData's streaming memcpy load with USE_SSE41
This should hopefully fix build issues on 32-bit Android-x86.

v2: s/USE_SSE4_1/USE_SSE41/, caught by Gražvydas Ignotas.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102050
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-12 01:42:32 -07:00
Kenneth Graunke
da0840246f i965: Clean up intel_batchbuffer_init().
Passing screen lets us get the kernel features, devinfo, and bufmgr,
without needing container_of.

This use of container_of could cause crashes due to issues with the
"sample" macro parameter.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102062
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-08-12 01:41:24 -07:00
Marek Olšák
b420680ede gallium/radeon: only pass shader-specific debug flags to the disk shader cache
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-11 20:38:29 +02:00
Marek Olšák
d1285a7103 radeonsi/gfx9: fix the scissor bug workaround
otherwise there is corruption in most apps.

Fixes: 0fe0320 radeonsi: use optimal packet order when doing a pipeline sync

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-11 20:38:29 +02:00
Marek Olšák
27fef5d52d radeonsi/gfx9: use the VI codepath for clamping Z
This fixes corrupted shadows in Unigine Valley.
The corruption disappeared when I stopped setting IMG_DATA_FORMAT_24_8
for depth.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-11 20:38:29 +02:00
Daniel Stone
2eee03b7a1 egl: Update headers from Khronos
Taken from egl-registry 7d68647c4dab.

Signed-off-by: Daniel Stone <daniels@collabora.com>
2017-08-11 11:16:00 +01:00
Daniel Stone
7d26a52a7a egl/dri2: Allow modifiers to add FDs to imports
When using dmabuf import, make sure that the modifier is actually
allowed to add planes to the base format, as implied by the comment.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-08-11 10:25:53 +01:00
Iago Toral Quiroga
81615ad444 intel/compiler: properly size attribute wa_flags array for Vulkan
Mesa will map user defined vertex input attributes to slots
starting at VERT_ATTRIB_GENERIC0 which gives us room for only 16
slots (up to GL_VERT_ATTRIB_MAX). This is sufficient for GL, where
we expose exactly 16 vertex attributes for user defined inputs, but
in Vulkan we can expose up to 28 (which are also mapped from
VERT_ATTRIB_GENERIC0 onwards) so we need to account for this when
we scope the size of the array of attribute workaround flags
that is used during the brw_vertex_workarounds NIR pass. This
prevents out-of-bounds accesses in that array for NIR shaders
that use more than 16 vertex input attributes.

Fixes:
dEQP-VK.pipeline.vertex_input.max_attributes.*

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-11 10:41:44 +02:00
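A minimal sketch of the sizing issue described in 81615ad444, assuming hypothetical names (GL_GENERIC_ATTRIBS, VK_GENERIC_ATTRIBS, vs_key_sketch, set_wa_flags) rather than the real i965 compiler structures. The two limits are the numbers quoted in the commit message. Sizing the array for the larger API keeps slots 16..27 in bounds.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits mirroring the numbers in the commit message. */
#define GL_GENERIC_ATTRIBS 16
#define VK_GENERIC_ATTRIBS 28
#define MAX_GENERIC_ATTRIBS \
   (VK_GENERIC_ATTRIBS > GL_GENERIC_ATTRIBS ? VK_GENERIC_ATTRIBS \
                                            : GL_GENERIC_ATTRIBS)

struct vs_key_sketch {
   /* One workaround-flag byte per generic attribute slot, sized for the
    * larger of the two APIs. */
   uint8_t attrib_wa_flags[MAX_GENERIC_ATTRIBS];
};

/* Store flags for a slot; rejects slots beyond the array instead of
 * writing out of bounds. */
static bool set_wa_flags(struct vs_key_sketch *key, unsigned slot,
                         uint8_t flags)
{
   if (slot >= MAX_GENERIC_ATTRIBS)
      return false;
   key->attrib_wa_flags[slot] = flags;
   return true;
}
```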
Timothy Arceri
9d41ec2182 glsl: stop cloning builtin functions in _mesa_glsl_find_builtin_function()
The cloning was introduced in f81ede4699 to fix a problem with
shaders including IR that was owned by builtins.

However the approach of cloning the whole function each time we
reference a builtin led to a significant reduction in the GLSL
IR compiler's performance.

The previous patch fixes the ownership problem in a more precise
way. So we can now remove this cloning.

Testing on a Ryzen 7 1800X shows a ~15% decrease in the time spent compiling the
Deus Ex: Mankind Divided shaders on radeonsi (which take 5min+ on
some machines). Looking just at the GLSL IR compiler the speed up
is ~40%.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-11 15:44:15 +10:00
Timothy Arceri
77f5221233 glsl: pass mem_ctx to constant_expression_value(...) and friends
The main motivation for this is that threaded compilation can fall
over if we were to allocate IR inside constant_expression_value()
when calling it on a builtin. This is because builtins are shared
across the whole OpenGL context.

f81ede4699 worked around the problem by cloning the entire
builtin before constant_expression_value() could be called on
it. However cloning the whole function each time we referenced
it led to a significant reduction in the GLSL IR compiler's
performance. This change along with the following patch
helps fix that performance regression.

Other advantages are that we reduce the number of calls to
ralloc_parent(), and for loop unrolling we free constants after
they are used rather than leaving them hanging around.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-11 15:44:08 +10:00
Timothy Arceri
d4f79e995f glsl: use ralloc_str_append() rather than ralloc_asprintf_rewrite_tail()
The Deus Ex: Mankind Divided shaders go from spending ~20 seconds
in the GLSL IR compiler's front-end down to ~18.5 seconds on a
Ryzen 1800X.

Tested by compiling once with shader-db then deleting the index file
from the shader cache and compiling again.

v2:
 - fix rebasing issue in v1

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-08-11 10:43:34 +10:00
Timothy Arceri
26f4657c3f util/ralloc: add ralloc_str_append() helper
This function differs from ralloc_strcat() and ralloc_strncat()
in that it does not do any strlen() calls which can become
costly on large strings.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-08-11 10:43:31 +10:00
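A minimal sketch of the technique described in 26f4657c3f, assuming plain realloc() instead of ralloc. The function name str_append_known_len is hypothetical. It only illustrates how passing both lengths in from the caller removes the need to rescan a large destination string on every append.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Append `str` (str_size bytes, not counting any terminator) to *dest, whose
 * current length is already known to the caller. No strlen() is performed. */
static bool str_append_known_len(char **dest, const char *str,
                                 size_t existing_length, size_t str_size)
{
   char *both = realloc(*dest, existing_length + str_size + 1);
   if (both == NULL)
      return false;   /* *dest is left untouched on failure */

   memcpy(both + existing_length, str, str_size);
   both[existing_length + str_size] = '\0';
   *dest = both;
   return true;
}
```

The caller is expected to track the running length itself, adding str_size after each successful append.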
Timothy Arceri
53320e25b4 glsl: remove unused field from ir_call
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-08-11 10:43:27 +10:00
Timothy Arceri
49d9286a3f glsl: stop copying struct and interface member names
We are currently copying the name for each member dereference
but we can just share a single instance of the string provided
by the type.

This change also stops us recalculating the field index
repeatedly.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-08-11 10:43:21 +10:00
Timothy Arceri
43cbcbfee9 glsl: tidy up get_num_operands()
Also add a comment that this should only be used by the ir_reader
interface for testing purposes.

v2:
 - fix grammar in comment
 - use unreachable rather than assert

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-08-11 10:43:16 +10:00
Timothy Arceri
e2e2c5abd2 glsl: calculate number of operands in an expression once
Extra validation is added to ir_validate to make sure this is
always updated to the correct number of operands, as passes like
lower_instructions modify the instructions directly rather than
generating a new one.

The reduction in time is so small that it is not really
measurable. However callgrind was reporting this function as
being called just under 34 million times while compiling the
Deus Ex shaders (just pre-linking was profiled) with 0.20%
spent in this function.

v2:
 - make num_operands a uint8_t
 - fix unsigned/signed mismatches

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-08-11 10:43:12 +10:00
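A minimal sketch of the caching and validation described in e2e2c5abd2, assuming hypothetical opcodes and type names (expr_op, expr, expr_set_op, expr_validate) rather than the real ir_expression class. The operand count is computed once when the opcode is set, and a validation step catches code that changes the opcode in place without refreshing the cached count.

```c
#include <stdint.h>

enum expr_op { OP_NEG, OP_ADD, OP_FMA };  /* hypothetical opcodes */

/* The per-opcode lookup that would otherwise run on every query. */
static uint8_t operands_for(enum expr_op op)
{
   switch (op) {
   case OP_NEG: return 1;
   case OP_ADD: return 2;
   case OP_FMA: return 3;
   }
   return 0;
}

struct expr {
   enum expr_op op;
   uint8_t num_operands;  /* cached when the opcode is set */
};

/* Set the opcode and refresh the cached operand count together. */
static void expr_set_op(struct expr *e, enum expr_op op)
{
   e->op = op;
   e->num_operands = operands_for(op);
}

/* Returns nonzero when the cached count still matches the opcode. */
static int expr_validate(const struct expr *e)
{
   return e->num_operands == operands_for(e->op);
}
```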
Kenneth Graunke
5563872dbf isl: Validate row pitch of stencil surfaces.
Also, silence an obnoxious finishme that started occurring for all
GL applications which use stencil after the i965 ISL conversion.

v2: Check against 3DSTATE_STENCIL_BUFFER's pitch bits when using
    separate stencil, and 3DSTATE_DEPTH_BUFFER's bits when using
    combined depth-stencil.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-10 15:18:58 -07:00
Emil Velikov
26fbb9eacd egl: avoid eglCreatePlatform*Surface{EXT,} crash with invalid dpy
If we have an invalid display fed into the functions, the display lookup
will return NULL. Thus, when we attempt to get the platform type, we
dereference it, leading to a crash.

Keep in mind that this will not happen if Mesa is built without X11 or
when the legacy eglCreate*Surface codepaths are used.

A similar check was added with earlier commit 5e97b8f5ce ("egl: Fix
crashes in eglCreate*Surface), although it was only applicable when the
surfaceless platform is built.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-08-10 19:41:51 +01:00
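A minimal sketch of the guard described in 26fbb9eacd, assuming hypothetical types and names (display, lookup_display, platform_of) rather than the real _EGLDisplay code. The lookup returns NULL for an unknown handle, so the platform type is read only after that result has been checked.

```c
#include <stddef.h>

enum platform { PLATFORM_NONE, PLATFORM_X11, PLATFORM_DRM };

struct display { enum platform platform; };

static struct display known_display = { PLATFORM_X11 };

/* Returns NULL for any handle that is not a known display. */
static struct display *lookup_display(void *handle)
{
   return handle == (void *)&known_display ? &known_display : NULL;
}

/* Read the platform type, treating a failed lookup as "no platform". */
static enum platform platform_of(void *handle)
{
   struct display *disp = lookup_display(handle);
   if (disp == NULL)
      return PLATFORM_NONE;   /* invalid display: do not dereference */
   return disp->platform;
}
```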
Emil Velikov
a51be4f9a6 egl/drm: rename dri2_drm_create_surface()
The function can handle only window surfaces, so let's rename it
accordingly, killing the wrapper around it.

v2: Use native_window in the function args. list.

Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-10 19:34:04 +01:00
Emil Velikov
430a80a7b6 egl/drm: remove unreachable code in dri2_drm_create_surface()
The function can be called only when the type is EGL_WINDOW_BIT.
Remove the unneeded switch statement.

v2: Rename the local variable window to surface (Eric)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)
2017-08-10 19:32:14 +01:00
Emil Velikov
794df9acad egl/x11: pass NULL instead of XCB_WINDOW_NONE as native_surface
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-08-10 19:30:17 +01:00
Matt Turner
9c0dad0a2b egl: Clean up native_type vs drawable mess
The next patch is going to stop passing XCB_WINDOW_NONE (of type
xcb_window_enum_t) as an argument where these functions expect a void *,
which clang does not appreciate.

This patch cleans things up to better convince me and reviewers that
it's safe to do that.

v2: Emil Velikov: rebase/integrate with series
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-08-10 19:29:37 +01:00
Emil Velikov
df8efd5b74 egl: handle BAD_NATIVE_PIXMAP further up the stack
The basic (null) check is identical across all backends.
Just move it to the top.

v2:
 - Split the WINDOW vs PIXMAP into separate patches
 - Move check after the dpy and config - dEQP expects so

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-10 19:28:04 +01:00
Emil Velikov
92b23683eb egl: drop unreachable BAD_NATIVE_WINDOW conditions
The code in _eglCreateWindowSurfaceCommon() already has a NULL check
which handles the condition. There's no point in checking again further
down the stack.

v2: Split the WINDOW vs PIXMAP into separate patches
v3: Resolve typos, s/EGL_PIXMAP_BIT_BIT/EGL_PIXMAP_BIT/

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-08-10 19:27:03 +01:00
Emil Velikov
47b06f5821 egl: add dri2_setup_swap_interval helper
The current two implementations - X11 and Wayland were identical,
barring the upper limit.

Instead of having same code twice - introduce a helper and pass the
limit as an argument.

Thus as Android/DRM/others get support - they only need to call the
function ;-)

v2: Rebase on top of keeping ::swap_available

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
2017-08-10 19:23:31 +01:00
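A minimal sketch of the refactoring described in 47b06f5821, assuming hypothetical names (vblank_mode, swap_limits, setup_swap_interval). The commit message only states that the two copies differed in their upper limit; the vblank modes and the defaults chosen below are assumptions made for illustration. Each platform calls one helper and passes its own limit.

```c
/* Hypothetical user preference for vertical sync. */
enum vblank_mode { VBLANK_NEVER, VBLANK_DEF_INTERVAL_0,
                   VBLANK_DEF_INTERVAL_1, VBLANK_ALWAYS_SYNC };

struct swap_limits {
   int min_interval;
   int max_interval;
   int default_interval;
};

/* Shared helper: the platform-specific part is only `max_limit`. */
static struct swap_limits setup_swap_interval(enum vblank_mode mode,
                                              int max_limit)
{
   struct swap_limits l = { 0, max_limit, 1 };

   switch (mode) {
   case VBLANK_NEVER:
      l.max_interval = 0;       /* swap interval is pinned to 0 */
      l.default_interval = 0;
      break;
   case VBLANK_ALWAYS_SYNC:
      l.min_interval = 1;       /* interval 0 is not allowed */
      break;
   case VBLANK_DEF_INTERVAL_0:
      l.default_interval = 0;
      break;
   case VBLANK_DEF_INTERVAL_1:
      break;                    /* keep the initial values */
   }
   return l;
}
```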
Tim Rowley
4d9b0dcccb configure: remove trailing "-a" in swr architecture test
Fixes "configure: line 27326: test: argument expected"

CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-08-10 13:06:39 -05:00
Matt Turner
904d416e3d build: Fix up spirv_info.Plo
spirv_info.c existed as a static file until commit 2dd4e2ece3 began
generating it as part of the build process. autotools is incapable of
coping, and so a build-tree from before this commit would then fail with
it:

[4]: *** No rule to make target '../../../mesa/src/compiler/spirv/spirv_info.c', needed by 'spirv/spirv_info.lo'.  Stop.

Add a few lines to configure.ac to update the broken build files.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-10 13:59:32 -04:00
Marek Olšák
4630ede102 ac: fail shader compilation if libelf is replaced by an incompatible version
UE4Editor has this issue.

This commit prevents hangs (release build) or assertion failures (debug
build). It doesn't fix the editor, but catastrophic scenarios are
prevented.

Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-10 13:24:23 +02:00
Thomas Hellstrom
2437ebd705 dri: Introduce SWAP_METHOD tokens
We shouldn't be using GLX tokens in the dri subsystem, so define dri
SWAP_METHOD tokens and translate when necessary. Unfortunately the X server
uses the dri swap method value untranslated as the GLX fbconfig swapMethod,
so we can't enumerate these tokens arbitrarily, but rather need to make them
have the same values as the corresponding GLX tokens.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-08-10 09:15:33 +02:00
Thomas Hellstrom
48bd91785a glx: Fix swap method config matching
Due to bugs in dri swap method reporting, neither the fbconfigs received from
the server nor the value reported from driconfigs were correct. Now that's been
fixed and we can enable config swapmethod matching again.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-08-10 09:15:33 +02:00
Thomas Hellstrom
fe4aae0e6a glx: Work around X servers reporting bogus values of GLX_SWAP_METHOD_OML
Due to the recently fixed bug where dri drivers didn't report a correct
__DRI_ATTRIB_SWAP_METHOD value, and the fact that X servers just forward this
incorrect value (from the AIGLX dri driver) untranslated as
GLX_SWAP_METHOD_OML, the latter value might be undefined when old dri AIGLX
drivers are used, which breaks client fbconfig matching with server fbconfigs.

So work around this by assuming GLX_SWAP_METHOD_UNDEFINED when a bogus value
is read.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-08-10 09:15:33 +02:00
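A minimal sketch of the workaround described in fe4aae0e6a. The three token values come from the GLX_OML_swap_method extension; the function name sanitize_swap_method is hypothetical and not taken from the patch. Any value other than the three defined swap methods is treated as undefined.

```c
/* Token values from the GLX_OML_swap_method extension. */
#define GLX_SWAP_EXCHANGE_OML  0x8061
#define GLX_SWAP_COPY_OML      0x8062
#define GLX_SWAP_UNDEFINED_OML 0x8063

/* Map a server-reported swap method onto a valid token. */
static int sanitize_swap_method(int reported)
{
   switch (reported) {
   case GLX_SWAP_EXCHANGE_OML:
   case GLX_SWAP_COPY_OML:
   case GLX_SWAP_UNDEFINED_OML:
      return reported;
   default:
      return GLX_SWAP_UNDEFINED_OML;   /* bogus value from an old driver */
   }
}
```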
Thomas Hellstrom
08bee3e5ac dri: Fix __DRIconfig reporting of __DRI_ATTRIB_SWAP_METHOD
The attribMap had two entries for this attribute, and
driGetConfigAttribIndex didn't return a proper value for this attribute.
Fix this, and also make sure we return SWAP_UNDEFINED for single-buffer
configs as required by the GLX_OML_swap_method spec.

Finally bump the dri core extension version to 2, indicating that we
correctly report __DRI_ATTRIB_SWAP_METHOD.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-08-10 09:15:33 +02:00
Dave Airlie
82ba384c10 radv: force cs/ps/l2 flush at end of command stream. (v2)
This seems like a workaround, but we don't see the bug on CIK/VI.

On SI with the dEQP-VK.memory.pipeline_barrier.host_read_transfer_dst.*
tests, when one test completes, the first flush at the start of the next
test causes a VM fault as we've destroyed the VM, but we end up flushing
the compute shader then, and it must still be in the process of doing
something.

Could also be a kernel difference between SI and CIK.

v2: hit this with a bigger hammer. This fixes a bunch of hangs
in the vk cts with the robustness tests.

Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101334
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-09 23:19:15 +01:00
Karol Herbst
24a799ad35 nv50/ir: fix ConstantFolding with saturation
For mul(a, +-1) codegen can generate OP_MOV with a saturation flag
set which is ignored at emission. The same can happen with add(a, 0),
and others.

Adding an assert for detecting more of such issues.

Fixes wrongly rendered water in Hitman Absolution running under wine.
Also a few shaders in Mad Max and Alien Isolation produce such MOVs.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
[imirkin: generalize the fix for other cases]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-08-09 10:25:26 -04:00
Rob Herring
cc43c4a9e5 st/dri2: fix kms_swrast driconf option handling
Commit e794f8bf8b ("gallium: move loading of drirc to pipe-loader")
moved the option cache to the pipe_loader_device. However, the
screen->dev pointer is not set when dri_init_options() is called. Move
the call to after the pipe_loader_sw_probe_kms() call so screen->dev is
set. This mirrors the code flow for dri2_init_screen().

Fixes: e794f8bf8b ("gallium: move loading of drirc to pipe-loader")
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-08-09 09:09:39 -05:00
Samuel Pitoiset
bbfad34606 radeonsi: drop two unused variables in create_function()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-09 12:56:00 +02:00
Eric Engestrom
5f4f5aadc3 egl: whitespace cleanup in eglapi.c
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
2017-08-09 09:59:12 +01:00
Iago Toral Quiroga
0415ef9ccd TextureStorage1D should return INVALID_OPERATION if target is not a 1D texture
Previous behavior was inconsistent with other texture targets so this has been
fixed in OpenGL 4.6.

Fixes:
KHR-GL45.direct_state_access.textures_storage_errors

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-09 09:28:33 +02:00
Iago Toral Quiroga
4234b36f05 Update TextureParameter* error for incompatible texture targets
The OpenGL 4.6 specs have been updated so that GetTextureParameter*
with a texture object with an incompatible TEXTURE_TARGET should now
report INVALID_OPERATION instead of INVALID_ENUM.

Fixes:
KHR-GL45.direct_state_access.textures_parameter_errors

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-09 09:28:08 +02:00
Tapani Pälli
b65a91e582 egl/dri2: refactor dri2_query_surface, swrastGetDrawableInfo
Currently swrastGetDrawableInfo always initializes w and h. This patch
refactors the function into x11_get_drawable_info, which returns success
and sets the values only if no error happened. Add a swrastGetDrawableInfo
wrapper function as expected by the DRI extension.

v2: init x,y,w,h in swrastGetDrawableInfo (Eric)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reported-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-09 08:42:11 +03:00
Kenneth Graunke
a1c9a6da18 i965/bufmgr: Set bo->idle after waiting.
After a successful wait, we know the buffer ought to be idle.

Chris points out that: "The only caveat here is that bo is global, and
we have a very unlikely (and probably unnoticeable) race condition with
multiple contexts."

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-08-08 16:45:15 -07:00
Kenneth Graunke
58a4fc2b00 i965: Don't use ggtt_bo for Gen8+ streamout offset buffer.
RELOC_NEEDS_GGTT is only meaningful on Sandybridge - it's skipped on
other generations - so this has no purpose.  Just use rw_bo().

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-08-08 14:26:24 -07:00
Kenneth Graunke
a8b36fbdfa i965: Simplify *_bo() helpers.
With the reloc domains gone, most of these are basically the same,
and the names don't make much sense anymore.  Simplify them to ro_bo(),
rw_bo(), and ggtt_bo().

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-08-08 14:26:24 -07:00
Kenneth Graunke
2a0b3c781c i965: Get rid of KSP_ro
The GPU reads the shader kernel from the program cache BO.  It never
writes it, so using a read-write BO reference makes no sense.

Just make KSP read-only, and drop KSP_ro.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-08-08 14:26:24 -07:00
Connor Abbott
c12c2e40a3 ac/nir: fix saturate emission
The .f32 was already getting added by emit_intrin_2f_param(). Noticed
when enabling LLVM module verification.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-08-08 11:58:21 -07:00
Jason Ekstrand
be0e13e49f i965: Only call create_for_planar_image for multiple planes
Before, we ended up always calling miptree_create_for_planar_image in
almost all cases because most images have image->planar_format != NULL.
This commit makes us only take that path if we have a multi-planar
format.

Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-08-08 09:32:20 +01:00
Timothy Arceri
da154786ce mesa: don't error check the default buffer object
An allocation check is already done when the buffer is created at
context creation.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-08 15:56:12 +10:00
Timothy Arceri
dae1e6ad11 mesa: check default buffer object creation was successful
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-08 15:56:12 +10:00
Timothy Arceri
da10065d2b mesa: add NULL checking to free_shared_state()
This will allow us to call this function from
_mesa_alloc_shared_state() in the case that we run out of memory
part way through allocating the state.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-08 15:56:12 +10:00
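A minimal sketch of the cleanup pattern described in da10065d2b, assuming hypothetical names (shared_sketch, free_shared, alloc_shared) rather than the real shared-state code. Because the free routine tolerates NULL members, the allocator can call it from any failure point part way through setup. The fail_second parameter exists only to simulate running out of memory.

```c
#include <stdlib.h>

struct shared_sketch {
   int *textures;
   int *buffers;
};

/* Safe on NULL and on a partially initialised object. */
static void free_shared(struct shared_sketch *s)
{
   if (s == NULL)
      return;
   free(s->textures);   /* free(NULL) is a no-op */
   free(s->buffers);
   free(s);
}

/* `fail_second` simulates running out of memory part way through. */
static struct shared_sketch *alloc_shared(int fail_second)
{
   struct shared_sketch *s = calloc(1, sizeof(*s));
   if (s == NULL)
      return NULL;

   s->textures = malloc(16 * sizeof(int));
   s->buffers = fail_second ? NULL : malloc(16 * sizeof(int));
   if (s->textures == NULL || s->buffers == NULL) {
      free_shared(s);   /* single cleanup path for every failure point */
      return NULL;
   }
   return s;
}
```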
Ilia Mirkin
8614679e78 glapi: per the extension spec, the EXT-suffixed function should be used
We already expose glMultiDrawElementsBaseVertexEXT as part of the
EXT_draw_elements_base_vertex chunk, so this one can just be removed.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-07 20:32:37 -04:00
Ilia Mirkin
76ce7f03e7 include: update GLES gl2ext header to no longer reference bad function
There was a previous error in the gl.xml and generated files that
referenced glMultiDrawElementsBaseVertexOES. This function should not
exist, only the EXT-suffixed version should.

Leaving the other headers alone to avoid conflicts with GL 4.6 work.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-07 20:32:32 -04:00
Bas Nieuwenhuizen
bfed189ee0 radv: remove semicolon in if(...);
Trivial.

Fixes: a6a6146aa9 "radv: Don't allow fmask swizzling for shareable images."
2017-08-08 00:01:47 +02:00
Alex Smith
2e9a13bf22 radv: Fix decompression on multisampled depth buffers
Need to take the sample count into account in the depth decompress and
resummarize pipelines and render pass.

Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
2017-08-07 23:47:49 +02:00
Bas Nieuwenhuizen
a6a6146aa9 radv: Don't allow fmask swizzling for shareable images.
Also adds an assert because you never know how the winsys changes, and
multiprocess format differences are annoying.

Fixes: 1e696b962b "radv: add separate fmask tile swizzle counter."
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-07 23:44:59 +02:00
Marek Olšák
a2703fc119 radeonsi: fix a compile failure due to disabled asserts
2017-08-07 22:51:45 +02:00
Marek Olšák
0fe0320dc0 radeonsi: use optimal packet order when doing a pipeline sync
Process most new SET packets in parallel with previous draw calls, then
flush caches and wait, start the draw, and do L2 prefetches last.

This decreases the [CP busy / SPI busy] ratio (verified with GRBM perf
counters). In other words, the time window when shaders are idle (between
the wait and the draw) is much shorter now.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák
895de1d03d radeonsi: expose the number of decompress calls to the HUD
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák
ca440bc651 gallium/radeon: rename GPU-dma-busy -> GPU-cp-dma-busy
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák
c093821cee radeonsi: rename shader_userdata -> shader_pointers where appropriate
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák
c441999b7a radeonsi: prefetch VBO descriptors after the first VGT shader
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák
e887c68bd2 radeonsi: add a separate dirty mask for prefetches
so that we don't rely on si_pm4_state_enabled_and_changed, allowing us
to move prefetches after draw calls.

v2: clear the dirty mask after unbinding shaders

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
2017-08-07 21:12:24 +02:00
Marek Olšák
a7b0014d1a radeonsi: add and use si_pm4_state_enabled_and_changed
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák
58d062b87d radeonsi: de-atomize L2 prefetch
I'd like to be able to move the prefetch call site around.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák
4e629ca7c7 radeonsi: align all CE dumps to L2 cache line size
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák
01fed67608 radeonsi: remove a tautology sctx->framebuffer.nr_samples >= 1
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Marek Olšák
1694a8ba8d gallium/radeon: print all members of radeon_info with R600_DEBUG=info
also set max_alignment on amdgpu.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-07 21:12:24 +02:00
Samuel Pitoiset
269c37a676 glsl: update the extensions/functions that are enabled for 460
Other ones are either unsupported or don't have any helper
function checks.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-08-07 21:06:54 +02:00
Gurchetan Singh
12181b5017 egl/dri2: add image extension so it's usable by swrast driver
Otherwise, this extension is not visible to the EGL users who
use the swrast driver.

This will allow the swrast driver to use eglCreateImageKHR,
provided the target is EGL_GL_TEXTURE_2D_KHR or
EGL_GL_RENDERBUFFER_KHR.  Note we still have to implement the
create from render buffer path.

v2: add it to optional_core_extensions instead of swrast_core_extensions,
    so it's not a requirement (Emil)
v3: Merge egl/dri2 changes together, also add support for
    platform_wayland (Emil)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
2017-08-07 18:17:17 +01:00
Gurchetan Singh
bbdeddd5fd st/dri: add drisw image extension
Since the relevant functions have been moved to dri_helpers,
drisw.c can make use of the extension. Note we have version 6
of the extension, since we want to support createImageFromTexture.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-07 18:16:52 +01:00
Gurchetan Singh
12fcdc4ba0 st/dri: move some image functions to dri_helpers.c
These functions will be used both by drisw.c and
dri2.c. This patch also moves some headers that can
be shared.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-07 18:16:48 +01:00
Gurchetan Singh
18eb3bdb85 st/dri: organize order of includes in dri_helpers
Although it doesn't seem like a strict requirement of the
code base, we do it when possible and it looks nice.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-07 18:16:45 +01:00
Gurchetan Singh
1825280128 st/dri: change dri_extensions to dri_helpers
These files provide helper structs and functions for dri2.c and drisw.c,
and the name change better conveys that.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-07 18:15:13 +01:00
Jason Ekstrand
e7a52cc381 i965/miptree: Set supports_fast_clear = false in make_shareable
The make_shareable function deletes the aux buffer and then whacks
aux_usage to ISL_AUX_USAGE_NONE but does not unset supports_fast_clear.
Since we only look at supports_fast_clear to decide whether or not to do
fast clears, this was causing assertion failures.

Reported-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101925
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-08-07 09:31:11 -07:00
Jason Ekstrand
24a0da338f i965/miptree: Rework create flags
The only one of the three remaining flags that has anything whatsoever
to do with layout is TILING_NONE.  This commit renames them to
MIPTREE_CREATE_*, documents the meaning of each flag, and makes the
create functions take an actual enum type so GDB will print them nicely.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-07 09:31:11 -07:00
Jason Ekstrand
55116839d9 i965/miptree: Delete MIPTREE_LAYOUT_TILING_(Y|ANY)
The only force tiling flag we really care about is LAYOUT_TILING_NONE.
The others don't actually do anything but add confusion.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-07 09:31:11 -07:00
Jason Ekstrand
1779499166 i965/miptree: Delete an unused function declaration
The implementation of brw_miptree_layout was removed in bf24c3539e.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-07 09:31:11 -07:00
Jason Ekstrand
8e5808fc0c i965/miptree: Call alloc_aux in create_for_bo
Originally, I had moved it to the caller to make some things easier when
adding the CCS modifier.  However, this broke DRI2 because
intel_process_dri2_buffer calls intel_miptree_create_for_bo but never
calls intel_miptree_alloc_aux.  Also, in hindsight, it should be pretty
easy to make the CCS modifier stuff work even if create_for_bo allocates
the CCS when DISABLE_AUX is not set.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
2017-08-07 09:31:11 -07:00
Jason Ekstrand
a5a673dfa7 i965/miptree: Delete MIPTREE_LAYOUT_FOR_SCANOUT
The flag hasn't affected actual surface layout for some time.  The only
purpose it served was to set bo->cache_coherent = false on the BO used
to create the miptree.  This is fairly silly because we can just set
that directly from the caller where it makes much more sense.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-07 09:31:11 -07:00
Jason Ekstrand
2bca18be44 i965/miptree: Delete some unused layout flags
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-07 09:31:11 -07:00
Jason Ekstrand
7659f8c904 i965/miptree: Refactor is_mcs_supported
We rename it to intel_miptree_supports_mcs and make the function
signature match intel_miptree_supports_ccs/hiz.  We also move the sample
count check into the function so it returns false for single-sampled
surfaces.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-07 09:31:11 -07:00
Jason Ekstrand
0e4d9a4b37 i965/miptree: Remove layout_flags parameter from is_mcs_supported
The one caller of is_mcs_supported passes 0 in as the layout_flags
unconditionally.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-08-07 09:31:11 -07:00
Jason Ekstrand
4d27c6095e intel/isl: Don't align the height of the last array slice
We were calculating the total height of 2D surfaces by multiplying the
row pitch by the number of slices.  This means that we actually request
slightly more space than actually needed since the padding on the last
slice is unnecessary.  For tiled surfaces this is not likely to make a
difference.  For linear surfaces, on the other hand, this means we may
require additional memory.  In particular, this makes the i965 driver
reject EGL imports of buffers which do not have this extra padding.
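
The difference between the two calculations can be sketched as follows. This
is a minimal illustration, not the ISL code: the function names and the
row_align parameter are hypothetical, and the per-slice pitch is assumed to be
the slice height rounded up to an alignment.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of a. */
uint32_t sketch_align_up(uint32_t v, uint32_t a)
{
    assert(a != 0);
    return (v + a - 1) / a * a;
}

/* Old behaviour: every slice, including the last, occupies a full
 * padded pitch. */
uint32_t sketch_total_rows_padded(uint32_t slice_rows, uint32_t row_align,
                                  uint32_t slices)
{
    return sketch_align_up(slice_rows, row_align) * slices;
}

/* New behaviour: only the slices before the last one need the padded
 * pitch; the last slice contributes just its real rows. */
uint32_t sketch_total_rows_tight(uint32_t slice_rows, uint32_t row_align,
                                 uint32_t slices)
{
    assert(slices > 0);
    return sketch_align_up(slice_rows, row_align) * (slices - 1) + slice_rows;
}
```

With 10-row slices aligned to 4 rows and 3 slices, the padded form asks for 36
rows and the tight form for 34, which is the kind of gap that made imports of
exactly-sized linear buffers fail.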

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
2017-08-07 09:31:11 -07:00
Jason Ekstrand
c15b92ce11 intel/isl: Stop padding surfaces
The docs contain a bunch of commentary about the need to pad various
surfaces out to multiples of something or other.  However, all of those
requirements are about avoiding GTT errors due to missing pages when the
data port or sampler accesses slightly out-of-bounds.  However, because
the kernel already fills all the empty space in our GTT with the scratch
page, we never have to worry about faulting due to OOB reads.  There are
two caveats to this:

 1) There is some potential for issues with caches here if extra data
    ends up in a cache we don't expect due to OOB reads.  However,
    because we always trash the entire cache whenever we need to move
    anything between cache domains, this shouldn't be an issue.

 2) There is a potential issue if a surface gets placed at the very top
    of the GTT by the kernel.  In this case, the hardware could
    potentially end up trying to read past the top of the GTT.  If it
    nicely wraps around at the 48-bit (or 32-bit) boundary, then this
    shouldn't be an issue thanks to the scratch page.  If it doesn't,
    then we need to come up with something to handle it.

Up until some of the GL move to ISL, having the padding code in there
just caused us to harmlessly use a bit more memory in Vulkan.  However,
now that we're using ISL sizes to validate external dma-buf images,
these padding requirements are causing us to reject otherwise valid
images due to the size of the BO being too small.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
2017-08-07 09:31:11 -07:00
Jason Ekstrand
06d3115bb9 anv/formats: Allow sampling on depth-only formats on gen7
We can't sample from depth-stencil formats on gen7 but we can sample
from depth-only formats.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102024
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-08-07 08:27:09 -07:00
Emil Velikov
4468764ef0 docs: drop released RCs from the calendar
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-07 15:41:09 +01:00
Emil Velikov
165be830fd docs: update calendar, add news item and link release notes for 17.1.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-07 13:14:38 +01:00
Emil Velikov
6dd9b9cd4a docs: add sha256 checksums for 17.1.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 2766ed0d45)
2017-08-07 13:10:59 +01:00
Emil Velikov
ad81c7e4bf docs: add release notes for 17.1.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 3d48433078)
2017-08-07 13:10:58 +01:00
Dave Airlie
8bf3930751 radv: fix MSAA on SI gpus.
This ports the workaround from radeonsi, that was missing in radv.

This fixes Talos rendering when MSAA is enabled on my Tahiti card.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-07 08:38:14 +01:00
Eleni Maria Stea
9f59cb2cda docs: removed the '--with-sha1' requirement from shading.html
The configuration option --with-sha1 is no longer required for the
MESA_SHADER_READ_PATH, MESA_SHADER_DUMP_PATH environment variables
to take effect.

1- removed the "--with-sha1" sentence from docs/shading.html
2- added an extra note: that the corresponding dumped and replacement
shaders must have the same filenames for the feature to take effect.

Acked-by: Tapani Pälli <tapani.palli@intel.com>
2017-08-07 10:20:04 +03:00
Dave Airlie
1e696b962b radv: add separate fmask tile swizzle counter.
This mirrors what Marek has done for radeonsi, and uses
a separate counter to handle the fmask surface for MSAA
MRTs.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-07 00:08:43 +01:00
Dave Airlie
3f389f75b6 radv: fix f16->f32 denorm handling for SI/CIK. (v2)
This just copies the code from the -pro shaders,
and fixes the tests on CIK.

With this CIK passes the same set of conformance
tests as VI.

Fixes: 83e58b03 (radv: flush f32->f16 conversion denormals to zero. (v2))
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-07 00:00:05 +01:00
Wladimir J. van der Laan
948bb2caba etnaviv: Add support for R8_UNORM textures
R8_UNORM textures can be emulated by means of L8 and a swizzle.
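
A rough sketch of the idea, assuming an L8 sample expands to (l, l, l, 1) and
an R8 sample should read as (r, 0, 0, 1). The enum, struct and function names
are hypothetical and do not reflect the etnaviv hardware encoding.

```c
#include <assert.h>

/* Hypothetical swizzle selectors. */
enum sketch_swz { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W, SWZ_ZERO, SWZ_ONE };

struct sketch_vec4 { float c[4]; };

/* What sampling an L8 texel yields: luminance replicated to RGB, alpha 1. */
struct sketch_vec4 sketch_sample_l8(float l)
{
    struct sketch_vec4 v = { { l, l, l, 1.0f } };
    return v;
}

/* Apply a per-channel swizzle to a sampled value. */
struct sketch_vec4 sketch_apply_swizzle(struct sketch_vec4 in,
                                        const enum sketch_swz s[4])
{
    struct sketch_vec4 out;
    for (int i = 0; i < 4; i++) {
        switch (s[i]) {
        case SWZ_ZERO: out.c[i] = 0.0f; break;
        case SWZ_ONE:  out.c[i] = 1.0f; break;
        default:
            assert(s[i] <= SWZ_W);
            out.c[i] = in.c[s[i]];
            break;
        }
    }
    return out;
}

/* R8 emulation: sample as L8, then keep X and force G and B to zero. */
struct sketch_vec4 sketch_sample_r8_via_l8(float r)
{
    static const enum sketch_swz r8[4] = { SWZ_X, SWZ_ZERO, SWZ_ZERO, SWZ_ONE };
    return sketch_apply_swizzle(sketch_sample_l8(r), r8);
}
```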

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-08-06 20:45:24 +02:00
Wladimir J. van der Laan
39056b0e2a etnaviv: Implement ICACHE
This patch adds support for large shaders on GC3000. For example the "terrain"
glmark benchmark with a large fragment shader will work after this.

If the GPU supports ICACHE, shaders larger than the available state area will
be uploaded to a bo of their own and instructed to be loaded from memory on
demand. Small shaders will be uploaded in the usual way. This mimics the
behavior of the blob.
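
The upload decision described above might look roughly like this. The names
and the byte-size comparison are assumptions made for illustration; the sketch
does not model what happens when a shader is too large on a GPU without
ICACHE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_upload_path {
    SKETCH_UPLOAD_STATE, /* write instructions into the state area */
    SKETCH_UPLOAD_BO,    /* place instructions in their own buffer object */
};

/* Pick where the shader instructions go. Only GPUs with ICACHE can fetch
 * instructions from memory on demand, so everything else keeps the usual
 * state-area path. */
enum sketch_upload_path
sketch_choose_upload(bool has_icache, size_t shader_bytes,
                     size_t state_area_bytes)
{
    assert(state_area_bytes > 0);
    if (has_icache && shader_bytes > state_area_bytes)
        return SKETCH_UPLOAD_BO;
    return SKETCH_UPLOAD_STATE;
}
```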

On GPUs that don't support ICACHE, this patch should make no difference.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-08-06 20:44:02 +02:00
Wladimir J. van der Laan
6c321c8b0b etnaviv: Unified uniforms support
GC3000 has changed from a separate store for VS and PS uniforms
to a single, unified one. There is backwards compatibility functionality,
however this does not work correctly together with ICACHE.

This patch adds explicit support, although in the simplest way possible:
the PS/VS uniforms split is still fixed and hardcoded. It should
make no difference on hardware that does not have unified uniform
memory.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-08-06 20:43:57 +02:00
Wladimir J. van der Laan
9c04c88830 etnaviv: Update headers from rnndb
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-08-06 20:43:48 +02:00
Ilia Mirkin
35d7145fa6 fix GL_ARB_spirv_extensions name
Trivial. There is no _gl_ in there.
2017-08-06 13:25:13 -04:00
Bas Nieuwenhuizen
acba3a3151 radv: Use the correct channel for alpha in resolve srgb conversion.
The argument here is a bitmask, so the old code selected .xy, which
got silently truncated to .x when constructing the vec4 from components,
instead of using .w.
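
The index-versus-bitmask mix-up can be shown with a small stand-in.
sketch_select_channels is a hypothetical helper written for this note, not a
NIR function: passing 3 as a mask picks components 0 and 1, while (1 << 3)
picks component 3.

```c
#include <assert.h>

/* Hypothetical helper: copies the components of v whose bit is set in
 * mask into out, in order, and returns how many were selected. */
unsigned sketch_select_channels(const float v[4], unsigned mask, float out[4])
{
    unsigned n = 0;
    assert((mask & ~0xfu) == 0);
    for (unsigned i = 0; i < 4; i++) {
        if (mask & (1u << i))
            out[n++] = v[i];
    }
    return n;
}
```

Called with mask 3 on (0.1, 0.2, 0.3, 0.4) this returns two components,
(0.1, 0.2); called with mask 1 << 3 it returns the single alpha value 0.4.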

Fixes: 588185eb6b "radv/meta: add srgb conversion to end of resolve shader."
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-06 16:07:13 +02:00
Bas Nieuwenhuizen
15e5a7a683 radv: Only convert linear->srgb in compute resolves.
It just works with the fragment shader resolve, so no need to do
a custom conversion. In fact with SRGB dest, it actually gives
wrong results.

Fixes: 69136f4e63 "radv/meta: add resolve pass using fragment/vertex shaders"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-06 16:07:09 +02:00
Bas Nieuwenhuizen
8286c3a49f radv: Don't use SRGB format for image stores during resolve.
These seem to store very bogus results. Luckily there is some code
that converts srgb->linear already, so just making the descriptor
format UNORM should work.

Fixes: 588185eb6b "radv/meta: add srgb conversion to end of resolve shader."
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-06 16:06:50 +02:00
Timothy Arceri
75fd4d8fd3 docs: add EXT_memory_object and EXT_memory_object_fd to relnotes
2017-08-06 12:51:12 +10:00
Andres Rodriguez
7fe5fa0013 radeonsi: enable support for EXT_memory_object
v2: fix an indentation error
v3: don't enable for r600

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-06 12:42:07 +10:00
Andres Rodriguez
14cad8786a radv: generate the same driver UUID as radeonsi
These need to match for interop compatibility queries.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-06 12:42:07 +10:00
Andres Rodriguez
f8ea71f047 radv: generate same device UUID as radeonsi
This is required for interop use cases. The same device must report
identical UUIDs through the GL and Vulkan APIs so that users can
identify when it is safe to perform a memory object import.

v2: use ac helpers to calculate the uuid

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-06 12:42:07 +10:00
Andres Rodriguez
059d82c1c2 mesa: hook up queries for NUM_TILING_TYPES and TILING_TYPES
These are just basic implementations.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-06 12:42:07 +10:00
Andres Rodriguez
68623933a0 radeonsi: hook up device/driver UUID queries
v2: move from r600_common to radeonsi

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-06 12:42:07 +10:00
Andres Rodriguez
6130c8e6e7 ac/gpu: add driver/device UUID query helpers
We need vulkan and gl to produce the same UUIDs. Therefore we should
keep the mechanism to compute these in a common location to guarantee
they are updated in lockstep.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-06 12:42:07 +10:00
Andres Rodriguez
b2aaa91e8d mesa: hook up UUID queries for driver and device
v2: respective changes for new gallium interface
v3: fix UUID size asserts

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-06 12:42:07 +10:00
Andres Rodriguez
95cb776049 gallium: introduce device/driver UUID queries
v2: remove unnecessary returns
v3 (Timothy Arceri): updated trace
v4 (Timothy Arceri): actually dump the params in trace

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-06 12:42:06 +10:00
Andres Rodriguez
e064d66020 mesa: implement glGetUnsignedByte{v|i_v}
These are used by EXT_external_objects to present UUIDs for the device
and the driver.

v2 (Timothy Arceri):
 - remove extra break
 - use _mesa_problem() rather than _mesa_error() for unimplemented
   support for value types

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-06 12:42:06 +10:00
Andres Rodriguez
921bdf1b6d mesa/st: expose EXT_memory_object and EXT_memory_object_fd
v2: use PIPE_CAP_MEMOBJ to guard the extension

v3 (Timothy Arceri):
 - expose extensions via the cap_mappings array

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-06 12:42:06 +10:00
Timothy Arceri
ba6eee218f mesa: hook up (Named)BufferStorageMem api
Include no_error variants as well.

v2 (Timothy Arceri):
 - reduced code churn by squashing some changes into
   previous commits

v3 (Timothy Arceri):
 - drop unused function declaration

v4 (Timothy Arceri):
 - fix Driver function assert()
 - add missing GL errors

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-06 12:42:06 +10:00
Andres Rodriguez
bbc9c2e4f8 mesa/st: implement memory objects as a backend for buffer objects
Use a memory object instead of user memory.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-06 12:42:06 +10:00
Dave Airlie
2bdb0da030 radeonsi: add basic memory object support
v2: also consider gfx9 metadata
v3: ref/unref memobj->buf
v4: add refcount comment

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-06 12:42:06 +10:00
Andres Rodriguez
ddf2c830a6 radeonsi: factor out metadata import
Plumbing for importing memobj backed textures.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-06 12:42:06 +10:00
Dave Airlie
7683540029 mesa/st: implement memory objects as a backend for texture storage
Instead of allocating memory to back a texture, use the provided memory
object.

v2: split off extension exposure logic
v3: de-duplicate code with st_AllocTextureStorage

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-06 12:42:06 +10:00
Andres Rodriguez
999653e398 mesa/st: factor out st_AllocTextureStorage into a helper
Plumbing for using memory objects as texture storage.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-06 12:42:06 +10:00
Andres Rodriguez
d0aac1b0aa mesa: hook up memory object multisamples tex(ture)storage api
V2 (Timothy):
 - error check memory == 0 before lookup

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-06 12:42:06 +10:00
Andres Rodriguez
fc790c50cc mesa: hook up memoryobject tex(ture)storage api
V2 (Timothy Arceri):
 - formatting fixes

V3 (Timothy):
 - error check memory == 0 before lookup

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-06 12:42:06 +10:00
Dave Airlie
49f4ecc677 mesa/st: start adding memory object support
v2: pass dedicated flag

v3 (Timothy Arceri):
 - remove unrequired _mesa_init_memory_object_functions()
   call in the state tracker.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-06 12:42:06 +10:00
Dave Airlie
714dfaae72 gallium: introduce memory object
v2: fix comment regarding fd ownership, define pipe_memory_object
v3: remove stray return
v4 (Timothy Arceri): update trace
v5 (Timothy Arceri): actually dump the params in trace

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v3)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-06 12:42:06 +10:00
Andres Rodriguez
1e8e4ee230 mesa: add support for memory object parameters
V2 (Timothy Arceri):
 - fix copy and paste error with error message

V3 (Timothy Arceri):
 - drop the Protected field for now as it's unused

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-06 12:42:06 +10:00
Andres Rodriguez
8b7c574479 mesa: add support for memory object creation/import/delete
Used by EXT_external_objects and EXT_external_objects_fd

V2 (Timothy Arceri):
 - Throw GL_OUT_OF_MEMORY error if CreateMemoryObjectsEXT()
   fails.
 - C99 tidy ups
 - remove void cast (Constantine Kharlamov)

V3 (Timothy Arceri):
 - rename mo -> memObj
 - check that the object is not NULL before initializing
 - add missing "EXT" in function error message

V4 (Timothy Arceri):
 - remove checks for (memory object id == 0) and catch in
   _mesa_lookup_memory_object() instead.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-06 12:42:06 +10:00
Andres Rodriguez
322ee1b363 mapi: add EXT_external_objects and EXT_external_objects_fd
Includes implementation stubs.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-06 12:42:06 +10:00
Aaron Watry
293b3e0a3f clover/device: Move device_version into core and add device_clc_version
The device version is the maximum CL version that the device supports.

device_version and device_clc_version are not necessarily the same for
devices that support CL 1.0, but have a 1.1 compiler and the necessary
extensions.

Eventually, this will be based on the features/extensions of the actual
device, but for now move it a bit closer to its eventual destination.

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
2017-08-05 19:50:30 -05:00
Dave Airlie
36a1b61321 radv: avoid GPU hangs if someone does a resolve with non-multisample src (v2)
This is a bug in the app, but I'd rather avoid hanging the GPU,
esp if someone is running in validation and it takes out their
development environment.

v2: get it right, reverse the polarity.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-05 03:52:44 +01:00
Emil Velikov
9777c4234b loader: drop the [gs]et_swap_interval callbacks
Having two callbacks to manage a single int seems like an overkill.
Use a cached copy and update that when needed.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
---
Might want to look if the dimensions dance in .query_surface ...
speaking of which close to nobody implements that ...
2017-08-04 23:57:22 +01:00
Emil Velikov
c961b679fe egl/x11: don't leak xfixes_query in the error path
If we get a xfixes v1.x we'll error out, without freeing the
xfixes_query reply.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-04 23:56:33 +01:00
Emil Velikov
10e7c2c64d loader: rework xmlconfig dependency
Currently xmlconfig is conditionally used, only when --enable-dri is
available.

As the library has moved to src/util and has a wider userbase, this guard
is no longer correct. Strictly speaking - it wasn't since the
introduction of xmlconfig into st/nine a while ago.

Unconditionally enable xmlconfig and drop the linking. As said before
there's other users of the library, so depending on the configure
options we will get multiple definitions of said symbols.

NOTE: To avoid breaking other combinations, this commit adds the
xmlconfig link to the required places - throughout gallium and the DRI
loaders.

Cc: Aaron Watry <awatry@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-08-04 23:54:52 +01:00
Chris Wilson
6c530ad116 i965: Reduce passing 2x32b of reloc_domains to 2 bits
The kernel only cares about whether the object is to be written to or
not, and reduces (reloc.read_domains, reloc.write_domain) down to just
!!reloc.write_domain. When we use NO_RELOC, the kernel doesn't even read
those relocs and instead userspace has to pass that information in the
execobject.flags. We can simplify our reloc api by also removing the
unused read/write domains and only pass the resultant flags.
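
As a rough model of that reduction: the two 32-bit domain masks collapse to a
single "is written" bit on the object. The struct, the flag name and its value
are hypothetical stand-ins rather than the kernel's definitions, and only the
write bit is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-object write flag. */
#define SKETCH_OBJECT_WRITE (1u << 0)

struct sketch_exec_object {
    uint64_t flags;
};

/* Record how a relocation uses its target. read_domains is accepted only
 * to mirror the old interface; nothing depends on it, and write_domain
 * matters only as zero versus non-zero. */
void sketch_mark_usage(struct sketch_exec_object *obj,
                       uint32_t read_domains, uint32_t write_domain)
{
    assert(obj != NULL);
    (void)read_domains;
    if (write_domain)
        obj->flags |= SKETCH_OBJECT_WRITE;
}
```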

The caveat to the above is when we need to make the kernel aware that
certain objects need to take into account different workarounds.
Previously, this was done using the magic (INSTRUCTION, INSTRUCTION)
reloc domains. NO_RELOC requires this to be passed in the execobject
flags as well, and now we push that up the callstack.

The API is more compact, more expressive of what happens underneath, but
unfortunately requires more knowledge of the system at the point of use.
Conversely it also means that knowledge is specific and not generally
applied and so not overused.

   text	   data	    bss	    dec	    hex	filename
8502991	 356912	 424944	9284847	 8dacef	lib/i965_dri.so (before)
8500455	 356912	 424944	9282311	 8da307	lib/i965_dri.so (after)

v2: (by Ken) Rebase.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-04 10:26:37 -07:00
Kenneth Graunke
2aacd22c0b i965: Convert reloc.target_handle into an index for I915_EXEC_HANDLE_LUT
Based on a patch by Chris Wilson (who also wrote this commit message).

Passing the index of the target buffer via the reloc.target_handle is
marginally more efficient for the kernel (it can avoid some allocations,
and can use a direct lookup rather than a hash or search). It is also
useful for ourselves as we can use the index into our exec_bos for other
tasks.
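
The two ways of naming a relocation target can be contrasted in a small
sketch. The struct and function names are hypothetical; the real kernel-side
lookup is more involved than a linear search.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_exec_obj {
    uint32_t handle;
    uint64_t offset;
};

/* Target named by handle: the entry has to be searched for. */
const struct sketch_exec_obj *
sketch_find_by_handle(const struct sketch_exec_obj *list, size_t count,
                      uint32_t handle)
{
    for (size_t i = 0; i < count; i++) {
        if (list[i].handle == handle)
            return &list[i];
    }
    return NULL;
}

/* Target named by its position in the list: a direct array access. */
const struct sketch_exec_obj *
sketch_find_by_index(const struct sketch_exec_obj *list, size_t count,
                     uint32_t index)
{
    assert(index < count);
    return &list[index];
}
```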

v2: Only enable HANDLE_LUT if we can use BATCH_FIRST and thereby avoid
a post-processing loop to fixup the relocations.
v3: Move kernel probing from context creation to screen init.
Use batch->use_exec_lut as it is more descriptive of what's going on (Daniel)
v4: Kernel features already exists, use it for BATCH_FIRST
Rename locals to preserve current flavouring
v5: Squash in "always insert batch bo first"
v6: (by Ken) Split out BATCH_FIRST from HANDLE_LUT.
2017-08-04 10:26:37 -07:00
Kenneth Graunke
4d26c77a71 i965: Use a C99 initializer for new validation list entries.
More succinct - we can skip a bunch of = 0 lines.
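
For illustration, a compound literal with designated initializers zeroes every
member that is not named, so stale values in the slot are overwritten without
explicit "= 0" lines. The struct below is a hypothetical validation-list
entry, not the real one.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_ENTRIES 8

/* Hypothetical validation-list entry. */
struct sketch_exec_entry {
    uint32_t handle;
    uint64_t offset;
    uint64_t flags;
    uint64_t reserved;
};

struct sketch_list {
    struct sketch_exec_entry entries[SKETCH_MAX_ENTRIES];
    unsigned count;
};

/* Append an entry and return its index. Members not named in the
 * initializer (flags, reserved) are set to zero by the language. */
unsigned sketch_append(struct sketch_list *list, uint32_t handle,
                       uint64_t offset)
{
    assert(list->count < SKETCH_MAX_ENTRIES);
    list->entries[list->count] = (struct sketch_exec_entry) {
        .handle = handle,
        .offset = offset,
    };
    return list->count++;
}
```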

Extracted from a patch by Chris Wilson.
2017-08-04 10:26:37 -07:00
Kenneth Graunke
68d611ed8e i965: Simplify some bo != batch->bo special cases.
Extracted from a patch by Chris Wilson.

Now that the batch is always at the front of the validation list,
we don't need to special case it - the usual "go find an existing BO"
code will work just fine.
2017-08-04 10:26:37 -07:00
Kenneth Graunke
29ba502a4e i965: Use I915_EXEC_BATCH_FIRST when available.
This will make it easier to use I915_EXEC_HANDLE_LUT.

Based on a patch by Chris Wilson.
2017-08-04 10:26:37 -07:00
Chris Wilson
e24f3fb7c8 i965: Move add_exec_bo()
To avoid a forward declaration in the next patch, move the definition of
add_exec_bo() earlier.

v2: (by Ken) redo move.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-04 10:26:37 -07:00
Chris Wilson
ba9b71e56a i965: Ignore reloc read/write domains
Since before the kernel supported I915_EXEC_NO_RELOC, long before our
minimum kernel requirement, the kernel unconditionally invalidated all
GPU TLBs before a batch and flushed all GPU caches after a batch. At
that moment, the only use for read/write domain was for activity
tracking, ensuring that future reads waited for the last writer and
future writes waited for all reads. This only requires a single bit in
the execbuf interface which can be supplied via the NO_RELOC interface,
making the use of relocation domains entirely redundant.

Trimming the excess writes into the array allows the compiler to be much
more frugal:

   text	   data	    bss	    dec	    hex	filename
8493790	 357184	 424944	9275918	 8d8a0e	i965_dri.baseline
8493758	 357184	 424944	9275886	 8d89ee	i965_dri.so

(This text improvement really does come from dropping domains, not from
the new use of C99 initializers.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-04 10:26:37 -07:00
Chris Wilson
3f353342a6 i965: Use I915_EXEC_NO_RELOC
If we correctly fill the batch with the right relocation value, and that
matches the expected location of the object, we can then tell the kernel
it can forgo checking each individual relocation by only checking
whether the object moved.
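
A simplified model of that check is sketched below. It is not the kernel's
implementation: the structs, field names and the use of 64-bit batch slots are
assumptions made for the example. A relocation is rewritten only when its
target ended up somewhere other than the offset already written into the
batch.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_object {
    uint64_t presumed_offset; /* what userspace wrote into the batch */
    uint64_t actual_offset;   /* where the object really is */
};

struct sketch_reloc {
    unsigned target;      /* index of the target object */
    unsigned batch_index; /* which batch slot holds the address */
    uint32_t delta;       /* offset within the target */
};

/* Patch only relocations whose target moved; returns how many were
 * rewritten. Objects that stayed put cost a single comparison. */
unsigned sketch_fixup_relocs(uint64_t *batch,
                             const struct sketch_object *objs, unsigned nobjs,
                             const struct sketch_reloc *relocs,
                             unsigned nrelocs)
{
    unsigned rewritten = 0;
    for (unsigned i = 0; i < nrelocs; i++) {
        assert(relocs[i].target < nobjs);
        const struct sketch_object *obj = &objs[relocs[i].target];
        if (obj->actual_offset == obj->presumed_offset)
            continue;
        batch[relocs[i].batch_index] = obj->actual_offset + relocs[i].delta;
        rewritten++;
    }
    return rewritten;
}
```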

v2: Rebase to apply ahead of I915_EXEC_HANDLE_LUT

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-04 10:26:37 -07:00
Kenneth Graunke
12a77f391f i965: Initialize flags to 0 and |= in new flags.
This makes it a bit easier to add new unconditional flags.
2017-08-04 10:26:37 -07:00
Kenneth Graunke
cf412f3afe i965: Make add_exec_bo return the validation list index.
This will be useful for I915_EXEC_HANDLE_LUT and I915_EXEC_NO_RELOC.
2017-08-04 10:26:37 -07:00
Chris Wilson
00f822ddfd i965: Track last location of bo used for the batch
Borrow a trick from anv, and use the last known index for the bo to skip
a search of the batch->exec_bo when adding a new relocation. In defence
against the bo being used in multiple batches simultaneously, we check
that this slot exists and points back to us.
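
The lookup can be sketched like this, with hypothetical struct and function
names and a fixed-size list for brevity. The cached index is trusted only if
the slot is in range and holds this very bo; otherwise the code falls back to
a search and refreshes the cache.

```c
#include <assert.h>

#define SKETCH_MAX_BOS 16

struct sketch_bo {
    unsigned index; /* last known slot in some batch's list; may be stale */
};

struct sketch_batch {
    struct sketch_bo *exec_bos[SKETCH_MAX_BOS];
    unsigned exec_count;
};

/* Return the slot of bo in the batch's list, adding it if absent. */
unsigned sketch_add_exec_bo(struct sketch_batch *batch, struct sketch_bo *bo)
{
    /* Fast path: the cached slot exists and points back at this bo. */
    if (bo->index < batch->exec_count && batch->exec_bos[bo->index] == bo)
        return bo->index;

    /* Slow path: the cache is stale, e.g. the bo was last added to a
     * different batch. Search, then remember the result. */
    for (unsigned i = 0; i < batch->exec_count; i++) {
        if (batch->exec_bos[i] == bo) {
            bo->index = i;
            return i;
        }
    }

    assert(batch->exec_count < SKETCH_MAX_BOS);
    bo->index = batch->exec_count;
    batch->exec_bos[batch->exec_count++] = bo;
    return bo->index;
}
```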

v2: Also update brw_batch_references()
v3: Reset bo->index on creation (Daniel)
v4: Improved explanation of bo->index (Kenneth)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-04 10:26:37 -07:00
Chris Wilson
2410deefff i965: Always use the pre-computed offset for the relocation entry
We must be careful to only compute the address once based on the
per-context information (rather than accessing the unlocked global
bo->offset64) so that the value in the batch does match the
reloc.presumed_offset we declare to the kernel. Otherwise, although it is
highly unlikely, we may see GPU hangs in multithreaded users.

The only real complication here is isl_surf_fill_state() which needs to
adjust the reloc.delta to both generate a tile offset and to encode state
into the lower 12 bits.

(Rebased on ISL changes by Ken.)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-04 10:26:37 -07:00
Kenneth Graunke
1d0bd0d174 i965: Make brw_emit_reloc assert that the target BO is non-NULL.
You need an actual BO to emit a relocation to it.

Suggested by me, authored by Chris, split out of a larger patch.
2017-08-04 10:26:37 -07:00
Emil Velikov
5c007203b7 configure.ac: drop manual detection of expat header/library
Use the .pc file, as provided by version 2.1.0 onward, and drop
the manual header/library check.

Version 2.1.0 was released back in Mar 2012 and all major distributions
use it.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (IRC)
2017-08-04 14:58:50 +01:00
Emil Velikov
6f9298dbde configure.ac: unconditionally check for expat
Earlier commits moved the xmlconfig library to a wider userbase.
Thus having the check within --enable-dri is insufficient.

Upon closer look, nine needed it from its early days - 948e6c5228
("nine: Add drirc options (v2)")

Fixes: 601093f95d ("xmlconfig: move into src/util")
Cc: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (IRC)
2017-08-04 14:58:50 +01:00
Mauro Rossi
f99a733e38 android: radeonsi: add nir include paths
Android build changes to avoid the following building error:

target  C: libmesa_pipe_radeonsi <= external/mesa/src/gallium/drivers/radeonsi/si_pipe.c
...
In file included from external/mesa/src/gallium/drivers/radeonsi/si_pipe.c:38:
external/mesa/src/compiler/nir/nir.h:48:10: fatal error: 'nir_opcodes.h' file not found
         ^
1 error generated.

Fixes: da62a31c5b "radeonsi: add nir include paths"
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-04 14:58:50 +01:00
Chris Wilson
b4f639d02a i965: Prefer using streaming reads from WC mmaps
For buffer objects, where we primarily expect to be writing to them and
so already have a WC mmap (for !llc access) reusing the existing mmap
and keeping the buffer out of the CPU cache seems preferable.

Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Matt Turner <mattst88@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-08-04 12:06:44 +01:00
Nicolai Hähnle
27ba094a4a pipe-loader: fix swrast probing
Missed updating this caller of pipe_loader_find_module.

Fixes: 0d7d60b7ea ("pipe-loader: pass only the driver_name to pipe_loader_find_module")
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-04 10:46:27 +02:00
Nicolai Hähnle
ae7283dcbc pipe-loader: remove config from pipe_loader_create_screen
The config passed into the screen should be independent from the state
tracker, because at least in the case of radeonsi, the screen structure
can be shared between different state trackers.

Incidentally, this also fixes crashes that were recently introduced.

Fixes: a35a9e7c ("gallium: add driconf options to pipe_screen_config")
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-04 10:46:24 +02:00
Nicolai Hähnle
9fb8476e67 gallium: get rid of pipe_screen_config::flags
They were set only by the DRI state tracker, which is problematic
when radeonsi is used with different state trackers in the same
process.

Also, we don't need them anymore.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-04 10:46:20 +02:00
Nicolai Hähnle
12ce39d3de radeonsi: set drirc compiler options before calling common screen init
Also, access the options directly, allowing us to get rid of the
PIPE_SCREEN_xxx flags.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-04 10:46:01 +02:00
Juan A. Suarez Romero
3b5743ead5 util: Makefile.am: add merge_driinfo.py in extra dist
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-04 09:54:46 +02:00
Juan A. Suarez Romero
5ff4c5aef4 radeonsi: Makefile.sources: include driinfo_radeonsi.h
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-04 09:54:46 +02:00
Juan A. Suarez Romero
86c68e0a33 anv: Makefile.vulkan.am: ICD json files are now generated with python
Commit 0ab04ba979 (anv: Use python to generate ICD json files) changed
the way ICD json files are created.

Remove the old .in files from extra dist, and add the python script.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-04 09:54:46 +02:00
Dave Airlie
fc625ba072 radv: also fix texture image descriptors for mipmap tile swizzle
This fixes the image descriptors for mipmapped tile swizzle

Fixes: 2b7e8556 (ac/surface: enable tile swizzle for mipmapped textures)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-04 07:13:40 +01:00
Dave Airlie
a6b4f04d9b radv: fix tile swizzle regression on mipmaps.
When Marek enabled mipmapped swizzle, radv didn't
have the code in place to handle it. This fixes the
regression.

I'll look more into GFX9 once I have a vega card (soon).
Fixes: 2b7e8556 (ac/surface: enable tile swizzle for mipmapped textures)

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-04 06:45:36 +01:00
Michel Dänzer
b73d8d4547 pipe-loader: Add driver build directory for si_driinfo.h include path
Fixes out-of-tree build failure:

.../src/gallium/targets/pipe-loader/pipe_radeonsi.c: In function ‘drm_configuration’:
.../src/gallium/targets/pipe-loader/pipe_radeonsi.c:38:33: fatal error: radeonsi/si_driinfo.h: No such file or directory
 #include "radeonsi/si_driinfo.h"
                                 ^
compilation terminated.
Makefile:994: recipe for target 'pipe_radeonsi.lo' failed
make[4]: *** [pipe_radeonsi.lo] Error 1

Trivial.

Fixes: 0f8c5de869 ("radeonsi: prepare for driver-specific driconf
                        options")
2017-08-04 11:49:46 +09:00
Jan Vesely
08f44a497c clover: Fix build after llvm r309911
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-08-03 20:59:16 -04:00
Marek Olšák
da942a4b81 radeonsi: program tile swizzle for color and FMASK surfaces for GFX & SDMA
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-04 02:10:04 +02:00
Marek Olšák
ae5d86e94d radeonsi: if FMASK is disabled, set CB_COLORi_FMASK = CB_COLORi_BASE properly
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-04 02:10:04 +02:00
Marek Olšák
7726092795 gallium/radeon: reallocate textures with non-zero tile_swizzle on export
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-04 02:10:04 +02:00
Marek Olšák
4a758a17da winsys/amdgpu: enable computation of tile swizzle
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-04 02:10:04 +02:00
Marek Olšák
c60c5accd1 ac/surface: align DCC size for surfaces that use tile swizzle
Note that dcc_alignment = pipe_interleave_bytes * num_pipes * num_banks,
which is greater than the previous open-coded alignment.
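
A small worked example of the formula above. The hardware parameters used
here (256-byte pipe interleave, 4 pipes, 8 banks) are made-up values chosen
only to show the arithmetic, and the helper names are hypothetical.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def dcc_alignment(pipe_interleave_bytes, num_pipes, num_banks):
    """Alignment as stated in the commit message."""
    return pipe_interleave_bytes * num_pipes * num_banks


# Hypothetical configuration, not taken from any specific GPU.
alignment = dcc_alignment(256, 4, 8)
padded_size = align_up(10000, alignment)
```

With these inputs the alignment is 8192 bytes, so a 10000-byte DCC buffer
would be padded to 16384 bytes.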

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-04 02:10:04 +02:00
Marek Olšák
0141beadd8 ac/surface: limit tile swizzle to non-mipmaps on SI
Mipmapping with tile swizzle doesn't work.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-04 02:10:04 +02:00
Marek Olšák
2b7e85562a ac/surface: enable tile swizzle for mipmapped textures
The tile swizzle computation was done after the whole miptree was computed,
but that was too late, because at that point AddrSurfInfoOut contained
information about the smallest miplevel, which is never 2D-tiled.

The correct way is to do the computation before the second level is computed.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-04 02:10:04 +02:00
Marek Olšák
6fb382d9fb ac/surface: set structure size and handle errors for AddrComputeBaseSwizzle
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-04 02:10:04 +02:00
Marek Olšák
59144d4bf5 ac/surface: increment surf_index only when tile swizzle is allowed
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-04 02:10:04 +02:00
Marek Olšák
9059400247 ac/surface: compute tile swizzle only when it's allowed
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-04 02:10:04 +02:00
Marek Olšák
4e757d591d ac/surface: add RADEON_SURF_SHAREABLE
Shareable textures won't use tile swizzle.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-04 02:10:04 +02:00
Marek Olšák
d311e837f4 ac/surface: remove RADEON_SURF_HAS_TILE_MODE_INDEX
it's useless

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-04 02:10:04 +02:00
Marek Olšák
4662e45350 ac/surface: move tile_swizzle to ac_surface and document it
Gfx9 will use it too.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-04 02:10:04 +02:00
Brian Paul
6839d33699 st/mesa: fix handling of NumSamples=1 (v2)
In Mesa we use the convention that if gl_renderbuffer::NumSamples
or gl_texture_image::NumSamples is zero, it's a non-MSAA surface.
Otherwise, it's an MSAA surface.  But in gallium nr_samples=1 is a
non-MSAA surface.

Before, if the user called glRenderbufferStorageMultisample() or
glTexImage2DMultisample() with samples=1 we skipped the search for the
next higher number of supported samples and asked the gallium driver to
create a surface with nr_samples=1.  So we got a non-MSAA surface.
This failed to meet the expectation of the user making those calls.

This patch changes the sample count checks in st_AllocTextureStorage()
and st_renderbuffer_alloc_storage() to test for samples > 0 instead of > 1.
And we now start querying for MSAA support at samples=2 since gallium has
no concept of a 1x MSAA surface.

A specific example of this problem is the Piglit arb_framebuffer_srgb-blit
test.  It calls glRenderbufferStorageMultisample() with samples=1 to
request an MSAA renderbuffer with the minimum supported number of MSAA
samples.  Instead of creating a 4x or 8x, etc. MSAA surface, we wound up
creating a non-MSAA surface.

Finally, add a comment on the gl_renderbuffer::NumSamples field.

There is one piglit regression with the VMware driver:
ext_framebuffer_multisample-blit-mismatched-formats fails because
now we're actually creating 4x MSAA surfaces (the requested sample
count is 1) and we're hitting some sort of bug in the blitter code.  That
will have to be fixed separately.  Other drivers may find regressions
too now that MSAA surfaces are really being created.

v2: start querying for MSAA support with samples=2 instead of 1.
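
The sample-count selection described in this message can be sketched as
below. This is a simplified Python model, not the st/mesa code: the function
name, the set of supported counts and the max_samples default are all
hypothetical.

```python
def pick_msaa_samples(requested, supported, max_samples=8):
    """Map a GL NumSamples request to a sample count the driver supports.

    requested == 0 follows the Mesa convention for a non-MSAA surface.
    Returns None when no suitable MSAA sample count exists.
    """
    if requested == 0:
        return 0
    # Any request > 0 wants MSAA; there is no 1x MSAA surface in gallium,
    # so the search starts at 2 even when the caller asked for 1.
    for n in range(max(2, requested), max_samples + 1):
        if n in supported:
            return n
    return None
```

For a driver that supports only 4x and 8x, a request for samples=1 now
yields 4 rather than a non-MSAA surface.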

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-03 14:13:57 -06:00
Brian Paul
426673e271 gallium/docs: add more info about TXF and MSAA textures
If the texture is multisampled, the coord.w component indicates which
sample to fetch.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-08-03 14:13:57 -06:00
Brian Paul
9e94aa7758 st/mesa: minor clean-ups in st_atom_msaa.c
Whitespace, formatting, combine nr_bits assignment with declaration.
Trivial.
2017-08-03 14:13:57 -06:00
Brian Paul
722ba1ad19 gallium/docs: document automatic per-sample FS execution
Both the GLSL 4.00 specs and DX10.1 specs specify that if a fragment
shader uses the sample ID or sample position inputs, the shader is
automatically run at per sample frequency.  Document that expectation
for gallium fragment shaders.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-08-03 14:13:57 -06:00
Brian Paul
6c46caedab mesa: init more msaa fields
The default values for GL_SAMPLE_SHADING and GL_MIN_SAMPLE_SHADING_VALUE
are missing from the state tables in the GL spec, but they're supposed
to be GL_FALSE and 0.0, per the GL_ARB_sample_shading spec.

Add code for that, just to be explicit.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-03 14:13:57 -06:00
Chuck Atkins
f0da70a964 swr: Add arch flags to support Cray and PGI compilers
Note that the Cray flags (-target-cpu=) need to come first since the
cray programming environment uses wrappers around other compilers.  By
checking the wrapper flags first, you can be sure to match the wrapper
flag instead of the underlying compiler (gcc, intel, pgi, etc.) flags.

Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-08-03 11:06:50 -05:00
Bruce Cherniak
9966c85e01 st/osmesa: add osmesa framebuffer iface hash table per st manager
Commit bbc29393d3 didn't include osmesa state_tracker.  This patch adds
necessary initialization.

Fixes crash in OSMesa initialization.

Created-by: Charmaine Lee <charmainel@vmware.com>
Tested-by: Bruce Cherniak <bruce.cherniak@intel.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
2017-08-03 11:05:58 -05:00
Lionel Landwerlin
1006cd512d anv: put anv_extensions.c in gitignore
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-08-03 16:14:45 +01:00
Nicolai Hähnle
33f7d71d53 pipe-loader: fix build of dynamic pipe-drivers
v2: add libxmlconfig.la to the dynamic pipe_radeonsi driver
v3: add libxmlconfig.la to targets/opencl build
v4: add EXPAT_LIBS to opencl build
    (note: for only-opencl builds, Emil's configure.ac changes
     are also needed)

Fixes: bc7f41e11d ("gallium: add pipe_screen_config to screen_create functions")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102014
Tested-by: Andy Furniss <adf.lists@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
2017-08-03 15:40:41 +02:00
Tapani Pälli
ca6237eb4f android: anv_extensions.c is generated to libmesa_vulkan_common
Fixes build error with anv_extensions.c not found for
libmesa_anv_entrypoints.

Fixes: d62063c "anv: Autogenerate extension query and lookup"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-03 13:09:59 +03:00
Mauro Rossi
5baed8f0e6 android: radeonsi: prepare for driver-specific driconf options
Android build changes to avoid the following building error:

In file included from external/mesa/src/gallium/targets/dri/target.c:1:
external/mesa/src/gallium/auxiliary/target-helpers/drm_helper.h:185:10:
fatal error: 'radeonsi/si_driinfo.h' file not found
         ^
1 error generated.

Fixes: 0f8c5de869 "radeonsi: prepare for driver-specific driconf options"
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-03 10:55:29 +01:00
Mauro Rossi
56eb2f3570 android: ac/common: always build NIR translation
Android build changes to avoid the following building error:

external/mesa/src/gallium/drivers/radeonsi/si_shader_nir.c:505:
error: undefined reference to 'ac_nir_translate'

Fixes: 86d4b46d66 "ac/common: always build NIR translation"
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-03 10:55:27 +01:00
Samuel Pitoiset
8e103371ed mesa: only check errors when the state changes in glLogicOp()
When this GL call is a no-op, it should be a little faster.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-03 10:56:02 +02:00
Samuel Pitoiset
39df62551c mesa: only check errors when the state changes in glBlendEquationSeparateiARB()
When this GL call is a no-op, it should be a little faster.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-03 10:56:02 +02:00
Kenneth Graunke
6d8af9fd50 i965: Drop unnecessary I915_PARAM_HAS_EXEC_CAPTURE defines
These were only here to keep building without needing to update libdrm.
Now that we include i915_drm.h in Mesa, we don't need this - our copy
is new enough and has the #define.

Trivial.
2017-08-03 01:31:08 -07:00
Juan A. Suarez Romero
06ab6ce612 ac: add ac_shader_abi.h in distcheck
Fixes:

  CXXLD    addrlib/libamdgpu_addrlib.la
ar: `u' modifier ignored since `D' is the default (see `U')
../../../../src/amd/common/ac_nir_to_llvm.c:33:27: fatal error:
ac_shader_abi.h: No such file or directory
 #include "ac_shader_abi.h"
                           ^
compilation terminated.
Makefile:985: recipe for target
'common/common_libamd_common_la-ac_nir_to_llvm.lo' failed

When running `make distcheck`

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-08-03 09:53:09 +02:00
Dave Airlie
271fa3a684 intel/vec4/gs: reset nr_pull_param if DUAL_INSTANCED compile failed.
If dual object compile fails (as seems to happen with virgl a
fair bit, and does piglit even have any tests for it?), we end up
not restarting the pull params, so we call
vec4_visitor::move_uniform_array_access_to_pull_constant
a second time and it runs over the ends of the alloc.

Fixes: tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test
running inside virgl on ivybridge.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-03 16:54:08 +10:00
Thomas Hellstrom
d5ba75f888 st/dri2: Plumb the flush_swapbuffer functionality through to dri3
Implement the state tracker manager drawable interface flush_swapbuffer
method by plumbing it through to dri3 if available.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2017-08-03 08:01:31 +02:00
Thomas Hellstrom
91c93dec98 gallium/st: Add a method to flush outstanding swapbuffers
Add a state tracker interface method to flush outstanding swapbuffers, and
add a call to it from the mesa state tracker during glFinish().
This doesn't strictly mean the outstanding swapbuffers have actually finished
executing, but it is sufficient for glFinish() to be usable as a replacement
for glXWaitGL().

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2017-08-03 08:01:25 +02:00
Thomas Hellstrom
ad5136ac82 glx/dri3: Implement the flush_swapbuffers method
Provide a dri3 implementation for the image loader extension method.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-03 08:00:25 +02:00
Thomas Hellstrom
ae93d534a8 dri: Add a flushSwapBuffers method to the image loader extension
This method may be used by dri drivers to make sure all outstanding
buffer swaps have been flushed to hardware.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-03 07:57:27 +02:00
Timothy Arceri
4e4042df6b gallium: introduce PIPE_CAP_MEMOBJ
This can be used to guard support for EXT_memory_object and related
extensions.

v2: update gallium docs

v3 (Timothy Arceri):
 - add cap to nv50

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-08-03 13:57:16 +10:00
Chris Wilson
fb63c43fd1 i965/blit: Remember to include miptree buffer offset in relocs
Remember to add the offset to the start of the buffer in the relocation
or else we write 0xff into random bytes elsewhere.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2017-08-02 18:06:35 -07:00
Matt Turner
858f554078 i965: Fix indentation
2017-08-02 16:49:32 -07:00
Bas Nieuwenhuizen
c9d4b571ad radv: Add suballocation for shaders.
This reduces the number of BOs that we need for the BO lists during
a submission.

Currently uses a fairly simple linear search for finding free space,
which could eventually be improved to a binary tree, which with some
per-node info could make a check for space O(1) and finding it O(log n),
in the number of buffers in that slab.
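
A minimal sketch of the kind of linear first-fit search described above.
It is not the radv implementation: the function names, the sorted
(offset, size) list and the 256-byte alignment are assumptions made for the
example.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def find_free_offset(slab_size, allocations, size, alignment=256):
    """First-fit search for `size` free bytes inside one slab.

    allocations is a list of (offset, size) tuples sorted by offset.
    Returns the offset of the first gap that is large enough, or None
    when the slab has no room left.
    """
    offset = 0
    for start, length in allocations:
        if start - offset >= size:
            return offset  # the gap before this allocation fits
        offset = align_up(start + length, alignment)
    if slab_size - offset >= size:
        return offset  # room after the last allocation
    return None
```

The search walks every existing allocation in the slab, which is the O(n)
cost the message proposes to improve later.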

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-03 00:45:13 +02:00
Jordan Justen
fe3d2559d9 docs: Add Vulkan to features.txt
To get the extension list:

$ git grep -hE "extension name=\"VK_KHR" src/vulkan/registry/vk.xml | \
  grep -v disabled | awk '{print $2}' | sed -E 's/(name=)?"//g' | sort

To find anv(il) and radv supported extensions:

$ git grep -hE "'VK_([A-Z]+)_[a-z]" src/intel/

$ git grep -hE "'VK_([A-Z]+)_[a-z]" src/amd/

v2:
 * Add radv to Vulkan 1.0 list (Bas)
 * 'started' => 'in progress'
 * Drop KHX and EXT extensions (Jason)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-08-02 14:49:47 -07:00
Kenneth Graunke
ebd2fd6ef3 i965: Set "Subslice Hashing Mode" to 16x16 on Apollolake.
As of 4.11, the kernel isn't bothering to set the subslice hashing mode
on Apollolake, leaving it at the default of 8x8.  (It initializes it to
16x4 on most platforms.)

Performance data for GPUTest Triangle on Apollolake at 1024x640:

   X-tiled RT:
   -----------
   8x8 -> 16x4:   2.4325%  +/- 0.383683% (n=107)
   8x8 -> 8x4:   -3.75105% +/- 0.592491% (n=40)
   8x8 -> 16x16:  6.17238% +/- 0.67157%  (n=30)

   Y-tiled RT:
   -----------
   8x8 -> 16x4:   1.30307%  +/- 0.297292% (n=205)
   8x8 -> 8x4:   -0.769282% +/- 0.729557% (n=35)
   8x8 -> 16x16:  3.00254%  +/- 0.715503% (n=40)

   8x MSAA RT (INTEL_FORCE_MSAA=8):
   --------------------------------
   8x8 -> 16x4:   1.38889% +/- 0.93729%  (n=7)
   8x8 -> 8x4:   -2.10643% +/- 1.15153%  (n=3)
   8x8 -> 16x16:  3.87183% +/- 1.08851%  (n=5)

Based on this, we choose 16x16 for Apollolake.
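
The choice can be checked mechanically from the table above. The sketch
below copies the mean improvements from the message (error margins
omitted); the dictionary layout and function name are hypothetical.

```python
# Mean change in percent relative to the 8x8 default, per render target type.
RESULTS = {
    "X-tiled": {"16x4": 2.4325, "8x4": -3.75105, "16x16": 6.17238},
    "Y-tiled": {"16x4": 1.30307, "8x4": -0.769282, "16x16": 3.00254},
    "8x MSAA": {"16x4": 1.38889, "8x4": -2.10643, "16x16": 3.87183},
}


def best_mode(results):
    """Return the hashing mode with the largest mean gain per RT type."""
    return {rt: max(modes, key=modes.get) for rt, modes in results.items()}
```

16x16 has the largest mean gain for all three render target types.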

Skylake GT2 with X-tiled buffers appears to be a toss-up between 16x4
and 16x16, and with Y-tiled buffers it doesn't seem to really matter.
So we'll leave Skylake alone for now.

The hashing mode doesn't seem to make a measurable impact on more
complex benchmarks.

Acked-by: Matt Turner <mattst88@gmail.com>
2017-08-02 13:31:56 -07:00
Dave Airlie
a60c584575 mesa/dri: drop unneeded mm.h include
This isn't used in any of these drivers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-03 06:19:45 +10:00
Dave Airlie
9e922bd78c r300: drop u_mm.h include.
This is not used in any of these files.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-03 06:19:42 +10:00
Emil Velikov
c9ec28b1c0 util: use canonical form of ARRAY_SIZE
Namely sizeof(foo)/sizeof((foo)[0])

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-02 20:43:33 +01:00
Emil Velikov
df83213702 i965: simplify intel_image_format_lookup()
Drop the local variable and return directly.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-08-02 20:42:21 +01:00
Emil Velikov
69fa9e91cb i965: annotate struct intel_image_format as const
Already used as such throughout the code.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-08-02 20:42:19 +01:00
Emil Velikov
31a6750988 st/dri: NULL check before deref DRI loader .getCapability
One could have vX+1 which introduces another entrypoint without
implementing older ones.

v2: Rebase, while keeping loaderPrivate

Fixes: 1bf703e4ea ("dri_interface,egl,gallium: only expose RGBA visuals
on Android")
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 20:42:19 +01:00
Eric Engestrom
dd9eb8db13 egl: check the correct function pointer
`.swap_interval` != `.SwapInterval`...

Fixes: 991ec1b81a "egl: make platform's SwapInterval() optional"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102015
Cc: Cedric Sodhi <manday@openmail.cc>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Cedric Sodhi <manday@openmail.cc>
2017-08-02 18:03:47 +01:00
Kenneth Graunke
595a47b829 i965: Delete pitch alignment assertion in get_blit_intratile_offset_el.
The cacheline alignment restriction is on the base address; the pitch
can be anything.

Fixes assertion failures when using primus (say, on glxgears, which
creates a 300x300 linear BGRX surface with a pitch of 1200):

intel_blit.c:190: get_blit_intratile_offset_el: Assertion `mt->surf.row_pitch % 64 == 0' failed.
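
The arithmetic behind the failing assertion, as a short sketch. The
4 bytes per pixel for BGRX follows from the 300-pixel width and the
1200-byte pitch quoted above; the names are hypothetical.

```python
CACHELINE = 64  # bytes, as in the removed assertion


def linear_pitch(width_px, bytes_per_pixel):
    """Row pitch of a tightly packed linear surface."""
    return width_px * bytes_per_pixel


pitch = linear_pitch(300, 4)    # 300-pixel-wide BGRX surface
remainder = pitch % CACHELINE   # non-zero, so the old assertion fired
```

Here pitch is 1200 and 1200 % 64 is 48, so a perfectly valid linear surface
tripped the check.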

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-08-02 10:01:34 -07:00
Tim Rowley
7cd50b9e47 swr/rast: fix core / knights split of AVX512 intrinsics
Move AVX512BW specific intrinsics to be Core-only.

Move some AVX512F intrinsics back to common implementation file.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-02 11:39:33 -05:00
Tim Rowley
c8fe4c13b2 swr/rast: simplify knob default value setup
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-02 11:39:33 -05:00
Tim Rowley
844be91e70 swr/rast: split gen_knobs templates into .h/.cpp
Switch to a 1:1 mapping template:generated for future maintenance.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-02 11:39:33 -05:00
Tim Rowley
4c5b4f3f78 swr/rast: gen_knobs template code style
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-02 11:39:33 -05:00
Tim Rowley
fb3e50a351 swr/rast: switch gen_knobs.cpp license
Unintentionally added with an apache2 license; relicense to match
the rest of the tree.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-02 11:39:33 -05:00
Tim Rowley
e4a6ae06cf swr/rast: fix scons gen_knobs.h dependency
Copy/paste error was duplicating a gen_knobs.cpp rule.

Fixes: 5079c277b5 ("swr: [scons] Fix windows build")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-02 11:39:33 -05:00
Tim Rowley
08e3c36955 swr/rast: constify swr rasterizer
Add "const" as appropriate in method/function signatures.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-02 11:39:33 -05:00
Tim Rowley
a3f97ff28b swr/rast: SIMD16 shaders - widen fetch and vertex shaders
Work in progress, disabled by default.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-02 11:39:33 -05:00
Tim Rowley
39ed8e297c swr/rast: vmask() implementations for KNL
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-02 11:39:33 -05:00
Tim Rowley
c18d91ca9a swr/rast: rename frontend pVertexStore
Rename to reflect global nature.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-02 11:39:33 -05:00
Tim Rowley
eddbd781af swr/rast: fix movemask_ps / movemask_pd on AVX512
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-02 11:39:33 -05:00
Tim Rowley
f253798205 swr/rast: stop using MSFT types in platform independent code
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-02 11:39:33 -05:00
Tim Rowley
030cfa8eed swr/rast: enable USE_SIMD16_FRONTEND by default
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-02 11:39:33 -05:00
Tim Rowley
f8a572cdf0 swr/rast: disable AVX512 optimization of SSE / AVX code
Disable an optimization which implemented sse/avx operations on avx512
using avx512 intrinsics (to avoid switching between lane widths).

Compile with SIMD_OPT_128_AVX512 / SIMD_OPT_256_AVX512 defined to enable
these optimizations.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-02 11:39:33 -05:00
Tim Rowley
d08493f9ce swr/rast: fix USE_SIMD16_FRONTEND issues
Fix problems found when enabling USE_SIMD16_FRONTEND, mostly related to
vMask / movemask_ps(pd).

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-02 11:39:33 -05:00
Tim Rowley
07062daae9 swr/rast: simdlib better separation of core vs knights avx512
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-02 11:39:33 -05:00
Tim Rowley
e1091b0861 swr/rast: threadID via portable std::this_thread::get_id()
Replace use of Win32 GetCurrentThreadId() with portable
std::this_thread::get_id().

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-08-02 11:39:33 -05:00
Jason Ekstrand
95c6a97464 spirv: Fix SpvImageFormatR16ui
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.1 17.2" <mesa-stable@lists.freedesktop.org>
2017-08-02 09:15:01 -07:00
Jason Ekstrand
277644221d anv: Advertise VK_KHR_relaxed_block_layout
There is literally no work for us to do here.  It already just works in
our driver.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-02 09:13:13 -07:00
Jason Ekstrand
600605e3fc anv: Bump the advertised version to 1.0.57
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-02 09:13:13 -07:00
Jason Ekstrand
077b200096 anv: Pull the API version from anv_extensions.py
This way everything stays in sync and we only have the one version
number.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-02 09:13:13 -07:00
Jason Ekstrand
0ab04ba979 anv: Use python to generate ICD json files
This is more lines of code but the python is far easier to read than the
sed expressions we were using before.  Also, this allows us to pull the
API version from anv_entrypoints.py so it never gets out-of-sync.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-02 09:13:13 -07:00
Jason Ekstrand
7382d8a416 anv: Add MAX_API_VERSION to anv_extensions.py
The VkVersion class is probably overkill but it makes it really easy to
compare versions in a way that's safe without the caller having to think
about patch vs. no patch.
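
A minimal sketch of what such a version class could look like. This is not
the code from anv_extensions.py: the comparison rule used here (ignore the
patch level whenever one side does not specify it) is an assumption about
what "safe" means, and the method names are hypothetical.

```python
from functools import total_ordering


@total_ordering
class VkVersion:
    def __init__(self, string):
        parts = string.split(".")
        if len(parts) not in (2, 3):
            raise ValueError("expected MAJOR.MINOR or MAJOR.MINOR.PATCH")
        self.major = int(parts[0])
        self.minor = int(parts[1])
        self.patch = int(parts[2]) if len(parts) == 3 else None

    def _keys(self, other):
        # Assumed rule: if either side lacks a patch level, compare only
        # major and minor so "1.0" and "1.0.57" are treated as equal.
        if self.patch is None or other.patch is None:
            return (self.major, self.minor), (other.major, other.minor)
        return ((self.major, self.minor, self.patch),
                (other.major, other.minor, other.patch))

    def __eq__(self, other):
        mine, theirs = self._keys(other)
        return mine == theirs

    def __lt__(self, other):
        mine, theirs = self._keys(other)
        return mine < theirs

    def __str__(self):
        base = "{}.{}".format(self.major, self.minor)
        return base if self.patch is None else "{}.{}".format(base, self.patch)
```

A caller can then write VkVersion("1.0") <= VkVersion("1.0.57") without
normalizing either string first.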

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-02 09:13:13 -07:00
Jason Ekstrand
a25267654b anv: Make some bits of anv_extensions module-private
This way we can use "from anv_extensions import *" in the entrypoint
generator without worrying too much about pollution

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-02 09:13:13 -07:00
Eric Engestrom
aab0649487 git_sha1_gen: catch any error the same way
Acked-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-02 14:57:54 +01:00
Tobias Klausmann
44828e99f9 build: Don't bail on OSError in git_sha1_gen.py
When building sandboxed, we may encounter additional errors. Ignore the errors,
as we are in a constrained environment.

This can be observed when building latest git with OBS.
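
The behaviour described above can be sketched like this. It is not the
contents of git_sha1_gen.py: the function name, the exact git arguments and
the empty-string fallback are assumptions made for the example.

```python
import subprocess


def get_git_sha1(cmd=("git", "rev-parse", "--short=10", "HEAD")):
    """Return the short commit hash, or '' when it cannot be determined."""
    try:
        out = subprocess.check_output(list(cmd), stderr=subprocess.DEVNULL)
    except (OSError, subprocess.CalledProcessError):
        # OSError covers a missing or non-executable git binary, which
        # can happen in a sandboxed build; do not abort the build.
        return ""
    return out.decode("ascii", errors="replace").strip()
```

A build script can call get_git_sha1() unconditionally and simply omit the
hash from the version string when the result is empty.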

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-02 14:28:58 +01:00
Nicolai Hähnle
e749995326 st/mesa: replace st_shader_stage_to_ptarget
Use pipe_shader_type_from_mesa instead.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 14:18:52 +02:00
Samuel Pitoiset
56e3b8b9e6 mesa: add GLSL 4.60 to shading_language_version()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-02 13:36:43 +02:00
Samuel Pitoiset
c245502918 mesa: add always-false enable for GL 4.6
I believe this should be enough for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-02 13:36:41 +02:00
Samuel Pitoiset
1f4ceb8be1 glsl: recognize GLSL 4.60
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-02 13:36:39 +02:00
Thomas Hellstrom
185ef06fd2 dri3: Wait for all pending swapbuffers to be scheduled before touching the front
This implements a wait for glXWaitGL, glXCopySubBuffer, dri flush_front and
creation of fake front until all pending SwapBuffers have been committed to
hardware. Among other things this fixes piglit glx-copy-sub-buffers on dri3.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: <mesa-stable@lists.freedesktop.org>
2017-08-02 13:29:20 +02:00
Samuel Pitoiset
dd4e817b7f mesa: add KHR_no_error support to glPolygonMode()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:32 +02:00
Samuel Pitoiset
1b603f0985 mesa: add polygon_mode() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:32 +02:00
Samuel Pitoiset
da0ecdae1d mesa: add KHR_no_error support to glClearBufferiv()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:32 +02:00
Samuel Pitoiset
54bd9a1d66 mesa: add clear_bufferiv() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:32 +02:00
Samuel Pitoiset
11e0542e5c mesa: add KHR_no_error support to glClearBufferuiv()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:32 +02:00
Samuel Pitoiset
b18b1fa6bc mesa: add clear_bufferuiv() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:32 +02:00
Samuel Pitoiset
73c5e750d7 mesa: add KHR_no_error support to glClearBufferfi()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:32 +02:00
Samuel Pitoiset
1ed61e0239 mesa: add clear_bufferi() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Samuel Pitoiset
5e05e7debc mesa: add KHR_no_error support to glClearBufferfv()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Samuel Pitoiset
33b47306e4 mesa: add clear_bufferfv() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Samuel Pitoiset
0127af1281 mesa: add KHR_no_error support to glClear*Buffer*Data()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Samuel Pitoiset
589450c4a2 mesa: add clear_buffer_sub_data_error() helper
And make clear_buffer_sub_data() always inline.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Samuel Pitoiset
c8191213b5 mesa: make get_texbuffer_format() global
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Samuel Pitoiset
1722c2498f mesa: add KHR_no_error support to glLinkProgram()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Samuel Pitoiset
fb3287804f mesa: add link_program() and link_program_error() helpers
And call link_program_error() from _mesa_link_program().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Samuel Pitoiset
fcd8ab6e86 mesa: add KHR_no_error support to glShaderSource()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Samuel Pitoiset
29f84556ca mesa: add shader_source() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Samuel Pitoiset
bece5a7ddd mesa: rename shader_source() to set_shader_source()
There is already get_shader_source(), and shader_source() will
be used for adding KHR_no_error support.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Samuel Pitoiset
71064d34aa mesa: add KHR_no_error support to glEndConditionalRender()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Samuel Pitoiset
4ded964fed mesa: add end_conditional_render() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Samuel Pitoiset
9a0b203382 mesa: add KHR_no_error support to glBeginConditionalRender()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Samuel Pitoiset
e1750e0a17 mesa: add begin_conditional_render() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Samuel Pitoiset
3f05193734 mesa: add KHR_no_error support to glNamedBufferData() and glBufferData()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Samuel Pitoiset
7f19018cc3 mesa: add buffer_data() and buffer_data_error() helpers
And call buffer_data_error() from _mesa_buffer_data().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Samuel Pitoiset
5c27de1ae1 mesa: add KHR_no_error support to glLineWidth()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Samuel Pitoiset
7327cb0602 mesa: add line_width() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 12:54:31 +02:00
Nicolai Hähnle
f45efb8129 pipe-loader: fix driinfo for software and non-radeonsi drivers
Fixes: 678dadf123 ("gallium: move driinfo XML to pipe_loader")
Reviewed-by: Thomas Hellström <thellstrom@vmware.com>
2017-08-02 12:15:04 +02:00
Thomas Hellstrom
eceb671002 mesa/st: Reduce the number of frontbuffer flush calls
The mesa state tracker was needlessly flushing the front buffer even if it
hadn't been drawn to since the last flush. This was happening during
glXSwapBuffers if we at some point previously had set that frontbuffer as
a read- or draw renderbuffer, or at glFlush() or glFinish() if we at some
point previously had rendered to the front buffer. Since the frontbuffer
flush typically means a full drawable copy, it's a pretty big waste.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2017-08-02 11:55:35 +02:00
Nicolai Hähnle
eb88ece9e3 Fix gallium SCons build
2017-08-02 11:48:56 +02:00
Juan A. Suarez Romero
c4210dec8a glsl: look up for transform feedback varyings after linking
Check if shaders have transform feedback varyings also after the
post-link step.

This fixes:
KHR-GL45.enhanced_layouts.xfb_vertex_streams
piglit/spec/arb_enhanced_layouts/gs-stream-location-aliasing

v2: add clarifying comments (Timothy)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-02 10:04:12 +02:00
Nicolai Hähnle
53485c2d0e radeonsi: add enable_sisched driconf option
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:50:59 +02:00
Nicolai Hähnle
0f8c5de869 radeonsi: prepare for driver-specific driconf options
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:50:58 +02:00
Nicolai Hähnle
1e334a396c pipe-loader: move configuration_query into drm_helper
Having it inline is pointless anyway, since it's only called via a
function pointer.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:50:58 +02:00
Nicolai Hähnle
b4ff5e90e9 st/dri: implement v2 of DRI_ConfigOptions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:50:58 +02:00
Nicolai Hähnle
aa222e21c2 pipe-loader: extract a standalone get_driver_descriptor helper function
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:50:58 +02:00
Nicolai Hähnle
0d7d60b7ea pipe-loader: pass only the driver_name to pipe_loader_find_module
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:50:58 +02:00
Nicolai Hähnle
a35a9e7c6f gallium: add driconf options to pipe_screen_config
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:50:57 +02:00
Nicolai Hähnle
e794f8bf8b gallium: move loading of drirc to pipe-loader
v2: rebase compile fix: addition of mesa_no_error

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2017-08-02 09:50:57 +02:00
Nicolai Hähnle
678dadf123 gallium: move driinfo XML to pipe_loader
We will switch to the pipe_loader loading the configuration options,
so that they can be passed to the driver independently of the state
tracker.

Put the description into its own file so that it can be merged easily
with driver-specific options in future commits.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:50:57 +02:00
Nicolai Hähnle
bc7f41e11d gallium: add pipe_screen_config to screen_create functions
This allows a more generic mechanism for passing user configurations
into drivers by accessing the dri options directly.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:50:57 +02:00
Nicolai Hähnle
781375ac6f st/drm: add DRM_CONF_XML_OPTIONS
Allow drivers to return the XML that describes the available config
options.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:50:57 +02:00
Nicolai Hähnle
bfc26c4120 util: add merge_driinfo.py
This tool merges driinfo XML that is built using DRI_CONF_xxx macros.
The intention is to merge together state-tracker options with
driver-specific options.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:50:57 +02:00
Nicolai Hähnle
ddd5f2a979 glx: use v2 of DRI_ConfigOptions
Most of the change is concerned with avoiding memory leaks, since v2 of
the DRI extension returns a malloc'ed string. This also allows us to
resolve the long-standing issue of keeping drivers loaded when returning
from glXGetDriverConfig.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:50:56 +02:00
Nicolai Hähnle
9435b9c544 dri: define a version 2 of the DRI_ConfigOptions extension
The new function is defined to return a malloc'ed pointer. In the
following patches, this helps avoid leaking library handles when pipe
drivers are linked dynamically.

It also allows us to generate the XML string on the fly in the future.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:50:56 +02:00
Nicolai Hähnle
78476cfe07 radeonsi: enable ARB_transform_feedback_overflow_query
v2: update for new cap name

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:49:09 +02:00
Nicolai Hähnle
1c5b7d5235 radeonsi: avoid redundant SET_PREDICATION packet with QBO workaround
The QBO workaround compute grid launch emits the render condition atom
when dirty, so install the render condition in the context only after
launching the compute grid. This avoids a redundant SET_PREDICATION.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:49:06 +02:00
Nicolai Hähnle
dfc1502c84 radeonsi: fix streamout overflow predication on VI+
There is a firmware regression that causes failures. Work around it by
using the compute shader for query_buffer_objects to summarize the query
results.

v2: rename to PREDICATION_OP_BOOL64 (consistent with sid.h)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:48:53 +02:00
Nicolai Hähnle
aff93fd60e gallium/radeon: implement qbo for SO_OVERFLOW_PREDICATE
v2: use R600_MAX_STREAMS instead of 4 (Marek)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:48:03 +02:00
Nicolai Hähnle
653316fb06 gallium/radeon: implement basic parts of PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE
v2: use R600_MAX_STREAMS instead of 4 (Marek)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:47:52 +02:00
Nicolai Hähnle
c8a5053252 gallium/radeon: fix render predication by SO overflow predicate
The predication bits are "visible or no overflow" and "not visible or
overflow", so we need to invert the check relative to the GL and Gallium
interface semantics.

Also, predication by the other streamout-related queries is not allowed.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:46:47 +02:00
Nicolai Hähnle
da83687c4b gallium/radeon: fix ARB_query_buffer_object conversion to boolean
The issue here is that the immediate is treated as a 64-bit value,
and fetching it does not work reliably with swizzles that are different
from xy and zw.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:46:41 +02:00
Nicolai Hähnle
d8b78bb0ee st/mesa: implement ARB_transform_feedback_overflow_query
v2: update for new cap name

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:46:38 +02:00
Nicolai Hähnle
877d800d60 ddebug: handle get_query_result_resource as a GPU call
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:46:36 +02:00
Nicolai Hähnle
f402fa371e gallium/util: add util_{str,dump}_query_value_type
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:46:34 +02:00
Nicolai Hähnle
aff9c54125 gallium: add util_dump_query_type and use it in ddebug
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:46:32 +02:00
Nicolai Hähnle
16923e42a4 gallium: rename util_dump_* to util_str_* for enum-to-string conversion
This is mostly mechanical search-and-replace, plus touching up the
macros in u_dump_defines.c manually a bit.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:46:24 +02:00
Nicolai Hähnle
a677799e51 gallium: add PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE and corresponding cap
v2: rename cap to PIPE_CAP_QUERY_SO_OVERFLOW and be a bit more explicit
    in the documentation

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 09:37:10 +02:00
Tapani Pälli
f444ac5e60 android: export intermediates from libmesa_util
Fixes following build issues:

   In file included from vendor/intel/external/android_ia/mesa/src/mesa/drivers/dri/common/dri_util.c:45:
   vendor/intel/external/android_ia/mesa/src/util/xmlpool.h:103:10: fatal error: 'xmlpool/options.h' file not found
   ...
   In file included from vendor/intel/external/android_ia/mesa/src/mesa/drivers/dri/i965/intel_screen.c:44:
   vendor/intel/external/android_ia/mesa/src/util/xmlpool.h:103:10: fatal error: 'xmlpool/options.h' file not found

Fixes: 601093f9 (xmlconfig: move into src/util)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
2017-08-02 10:32:48 +03:00
Tapani Pälli
99c764b647 intel: move gen_decoder.* back to COMMON_FILES
This change reverts commit 4f695731; we want to be able to build
with -DDEBUG and gen_decoder on Android.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-02 10:31:13 +03:00
Tapani Pälli
9bd15da85d android: link libmesa_intel_common with zlib and expat
Makes it possible to build Mesa on Android with -DDEBUG with
the next patch that reverts 4f695731.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-02 10:30:50 +03:00
Bas Nieuwenhuizen
341578a6ae ac/nir: Add float cast before shadow comparator clamp.
LLVM complained about passing an i32 to a float clamp.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: 0f9e32519b "ac/nir: clamp shadow texture comparison value on VI"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 08:43:13 +02:00
Chris Wilson
f28c2e2256 i965: Check result of make_surface() for intel_miptree_create_for_bo
Since make_surface() can fail, e.g. if the format isn't supported by hw
or on a similar error, we need to check the result before dereferencing it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-08-01 21:28:09 -07:00
Dave Airlie
246690b683 virgl: add BPTC support.
This just adds the guest checks for BPTC, the host renderer
also needs code to support these.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-02 13:54:38 +10:00
Timothy Arceri
06237fc9e1 mesa/st: fix conditional jump depends on uninitialised value
Reported by valgrind at:
glsl_to_tgsi_visitor::visit(ir_expression*) (st_glsl_to_tgsi.cpp:1560)

When compiling the Deus Ex shaders.

Fixes: 28a5e7104 ("st/glsl_to_tgsi: handle precise modifier")
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-08-02 12:55:42 +10:00
Dave Airlie
cb6f16dce9 radeon/ac: use ds_swizzle for derivs on si/cik.
This looks like it's supported since llvm 3.9 at least,
so switch over radeonsi and radv to using it, -pro also
uses this. We can now drop creating lds for these operations
as the ds_swizzle operation doesn't actually write to lds at all.

Acked-by: Marek Olšák <marek.olsak@amd.com>
(stable requested due to fixing radv CIK conformance tests)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-02 00:12:01 +01:00
Jason Ekstrand
35338a242b vulkan: Import in the latest 1.0.57 header and XML from Khronos
Acked-by: Dave Airlie <airlied@redhat.com>
2017-08-01 13:27:12 -07:00
Connor Abbott
ddd9e11795 ac/nir: fix nir_op_unpack_64_2x32_split_y emission
This was broken thanks to a typo in b2367cf.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-01 12:20:49 -07:00
Connor Abbott
6d731c5651 ac/nir: fix lsb emission
This makes it match radeonsi. The LLVM backend itself will emit the
correct instruction, but LLVM might do incorrect optimizations since it
thinks the output is undefined when the input is 0, even though it's not
supposed to be. We really need a new intrinsic, or for the backend to
become smarter and recognize this pattern.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <basni@google.com>
2017-08-01 12:20:49 -07:00
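
Note on 6d731c5651: a minimal sketch of the semantics at stake, assuming the GLSL findLSB convention that a zero input yields -1 rather than an undefined result. The helper name `find_lsb` is hypothetical, and this is plain Python, not the LLVM IR the driver emits.

```python
def find_lsb(value, bits=32):
    """Index of the least significant set bit, or -1 when value is 0."""
    value &= (1 << bits) - 1      # treat the input as an unsigned integer
    if value == 0:
        return -1                 # the zero case must be defined explicitly
    # value & -value isolates the lowest set bit; bit_length gives index + 1.
    return (value & -value).bit_length() - 1
```

The explicit zero check is the point: a count-trailing-zeros primitive that is undefined for 0 cannot be used on its own if the zero case has to produce a fixed value.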
Connor Abbott
de91461575 nir: fix algebraic optimizations
The optimizations are only valid for 32-bit integers. They were
mistakenly firing for 64-bit integers as well.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-08-01 12:20:49 -07:00
Jason Ekstrand
d62063ce31 anv: Autogenerate extension query and lookup
As time goes on, extension advertising is going to get more complex.
Today, we either implement an extension or we don't.  However, in the
future, whether or not we advertise an extension will depend on kernel
or hardware features.  This commit introduces a python codegen framework
that generates the anv_EnumerateFooExtensionProperties functions as well
as a pair of anv_foo_extension_supported functions for querying for the
support of a given extension string.  Each extension has an "enable"
predicate that is any valid C expression.  For device extensions, the
physical device is available as "device" so the expression could be
something such as "device->has_kernel_feature".  For instance
extensions, the only option is VK_USE_PLATFORM defines.

This mechanism also means that we have a single one-line-per-entry table
for all extension declarations instead of the two tables we had in
anv_device.c and the one we had in anv_entrypoints_gen.py.  The Python
code is smart and uses the XML to determine whether an extension is an
instance extension or device extension.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-01 11:12:41 -07:00
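
Note on d62063ce31: the idea of a one-line-per-entry table with a C "enable" predicate can be sketched roughly as follows. This is not the generator that landed in Mesa; the extension names, the `Extension` class, and the generated function name are all hypothetical, and only the predicate string `device->has_kernel_feature` is taken from the commit message.

```python
class Extension:
    """One table entry: an extension name plus a C expression enabling it."""

    def __init__(self, name, enable):
        self.name = name
        # Booleans are turned into C literals; strings are used verbatim.
        if isinstance(enable, bool):
            self.enable = "true" if enable else "false"
        else:
            self.enable = enable


# Hypothetical table, one line per extension.
DEVICE_EXTENSIONS = [
    Extension("VK_EXAMPLE_always_on", True),
    Extension("VK_EXAMPLE_needs_kernel", "device->has_kernel_feature"),
]


def generate_supported_function(extensions):
    """Emit C source for a lookup function driven by the table above."""
    lines = [
        "bool",
        "example_device_extension_supported(struct example_device *device,",
        "                                   const char *name)",
        "{",
    ]
    for ext in extensions:
        lines.append('   if (strcmp(name, "%s") == 0)' % ext.name)
        lines.append("      return %s;" % ext.enable)
    lines.append("   return false;")
    lines.append("}")
    return "\n".join(lines)


c_source = generate_supported_function(DEVICE_EXTENSIONS)
```

Because the predicate is pasted into the generated C, it can be any expression that is valid where `device` is in scope, which is what lets advertising depend on kernel or hardware features.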
Jason Ekstrand
ddc86c1d0e anv: Add a new centralized extensions file
This will allow us to keep everything in one place when it comes to
declaring what extensions are supported.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-08-01 11:12:41 -07:00
Gwan-gyeong Mun
fe2a6281b3 egl/drm: Fix misused x and y offsets in swrast_get_image()
It fixes misused x and y variables on the calculation of the memory copy regions.

Cc: Giovanni Campagna <gcampagna@src.gnome.org>
Fixes: 8430af5ebe "Add support for swrast to the DRM EGL platform"
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

[Eric: use gbm_bo_get_bpp() instead of local function, split clamp patch]
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-01 18:37:58 +01:00
Gwan-gyeong Mun
3a5e3aa5a5 egl/drm: Fix misused x and y offsets in swrast_put_image2()
It fixes misused x and y variables on the calculation of the memory copy regions.

Cc: Giovanni Campagna <gcampagna@src.gnome.org>
Fixes: 8430af5ebe "Add support for swrast to the DRM EGL platform"
Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

[Eric: use gbm_bo_get_bpp() instead of local function, split clamp patch]
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-01 18:37:58 +01:00
Eric Engestrom
04a40f7d2a gbm: add gbm_bo_get_bpp()
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-01 18:36:23 +01:00
Scott D Phillips
3db05ed1d1 gles: Restore some lost typedefs
GLES/gl.h has historically provided some typedefs that are not
used in the API itself. Restore these typedefs that were lost to
avoid breaking applications.

These seem to be the only typedefs removed in the update.

Fixes: 7fd0817 "Update Khronos-supplied headers"

[Eric: added a big warning to revert this patch when pulling the updated header]
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-01 18:26:15 +01:00
Eric Engestrom
e7fb7fd4ea egl: remove unnecessary empty array element
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-01 17:43:15 +01:00
Eric Engestrom
c3b223f48f egl: split enums to make use of -Wswitch
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-01 17:43:11 +01:00
Eric Engestrom
270a1c7110 egl: use designated initialiser for _eglGlobal
Turn comments into actual code, that the compiler can check for us :)
(Speaking of, one of the comments had a typo. Challenge: find it)

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-01 17:43:06 +01:00
Eric Engestrom
991ec1b81a egl: make platform's SwapInterval() optional
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-08-01 17:36:57 +01:00
Eric Engestrom
97eadb07e7 loader: remove clamp_swap_interval()
As of last commit, no invalid swap interval can be stored, so there's
no need to sanitize the values when reading them anymore.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-08-01 17:36:57 +01:00
Eric Engestrom
2714a8f3e9 egl: deduplicate swap interval clamping logic
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-08-01 17:36:57 +01:00
Juan A. Suarez Romero
54826331b3 glsl: xfb_stride applies to buffers, not block members
When we have an interface block like:

layout (xfb_buffer = 0, xfb_offset = 0) out Block {
                             vec4 var1;
    layout (xfb_stride = 48) vec4 var2;
                             vec4 var3;
};

According to ARB_enhanced_layouts spec:

   "The *xfb_stride* qualifier specifies how many bytes are consumed by
    each captured vertex.  It applies to the transform feedback buffer
    for that declaration, whether it is inherited or explicitly
    declared. It can be applied to variables, blocks, block members, or
    just the qualifier out. [ ...] While *xfb_stride* can be declared
    multiple times for the same buffer, it is a compile-time or
    link-time error to have different values specified for the stride
    for the same buffer."

This means xfb_stride actually applies to the buffer, and not to the
individual components.

In the above example, it means that var2 consumes 16 bytes, and var3 is
at offset 32.

This has been confirmed also by John Kessenich, the main contact for the
ARB_enhanced_layouts specs, and also because this commit fixes:

GL45.enhanced_layouts.xfb_block_member_stride

This commit is in practice a revert of 598790e856 (glsl: apply
xfb_stride to implicit offsets for ifc block members).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-01 15:58:24 +00:00
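
Note on 54826331b3: the arithmetic in the example above can be worked through in a few lines. This is only an illustration of the offsets stated in the commit message, assuming each vec4 occupies 16 bytes; the function name `member_offsets` is hypothetical.

```python
VEC4_SIZE = 16  # bytes: four 32-bit floats


def member_offsets(member_sizes, start_offset=0):
    """Implicit offsets of consecutive block members, packed back to back."""
    offsets = []
    offset = start_offset
    for size in member_sizes:
        offsets.append(offset)
        offset += size  # xfb_stride does not pad individual members
    return offsets


# Block { vec4 var1; vec4 var2; vec4 var3; } with xfb_offset = 0.
offsets = member_offsets([VEC4_SIZE, VEC4_SIZE, VEC4_SIZE])

# xfb_stride = 48 applies to the whole buffer: bytes consumed per vertex.
BUFFER_STRIDE = 48


def vertex_base(vertex_index):
    """Byte offset at which the captured data of a vertex starts."""
    return vertex_index * BUFFER_STRIDE
```

Here `offsets` places var1, var2 and var3 at 0, 16 and 32, matching the statement that var2 consumes 16 bytes and var3 is at offset 32, while the stride of 48 only determines where the next vertex begins.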
Jose Fonseca
d4b4478390 build: Convert git_sha1_gen script to Python (part2).
Things pointed out by Emil.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-08-01 16:33:55 +01:00
Marek Olšák
39608761cd st/dri: don't set PIPE_BIND_SHARED for privately-allocated renderbuffers
which are MSAA and depth/stencil buffers.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-08-01 17:06:38 +02:00
Marek Olšák
cb8ecb2f36 radeonsi: don't print AMD twice in the renderer string with the marketing name
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-01 17:06:38 +02:00
Marek Olšák
1aeafb59e6 radeonsi: print CE IBs into ddebug reports
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-01 17:06:38 +02:00
Marek Olšák
1482861abe radeonsi: fix printing vertex buffer descriptors into ddebug reports
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-01 17:06:38 +02:00
Marek Olšák
404f524fe2 radeonsi: don't flush sL1 conditionally in WAIT_ON_CE_COUNTER
I don't know the condition for the flush, but we better turn this off.
The sL1 flush is used when CE dumps stuff into a ring buffer and the ring
buffer wraps.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-01 17:06:38 +02:00
Marek Olšák
94965b8219 radeonsi: set up HTILE in descriptors only when level 0 is accessible
Compression isn't enabled with non-zero levels.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-01 17:06:38 +02:00
Marek Olšák
b9fc9d3f24 radeonsi: fix various CLEAR_STATE issues
Fixes: 064550238e ("radeonsi: use CLEAR_STATE to initialize some
                      registers")
Bugzilla: https://bugs.freedesktop.org/101969
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-01 17:06:38 +02:00
Jose Fonseca
b99dcbfeb3 build: Convert git_sha1_gen script to Python.
Python is the scripting language we've been using for scripts that need
to run across all supported platforms.

Shell is *not* a portable language for scripts.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-01 15:24:39 +01:00
Nicolai Hähnle
1bc8b2c0eb Fix SCons build
Fixes: 601093f95d ("xmlconfig: move into src/util")
Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Roland Scheidegger <sroland@vmware.com>
2017-08-01 13:52:59 +02:00
Samuel Pitoiset
af45b8159c mesa: fix bad cast conversions in viewport()
Fixes: ddc32537d6 ("mesa: clamp viewport values only once when using glViewport()")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101981
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101989
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-08-01 09:50:14 +02:00
Kenneth Graunke
5281e4ed3b i965/drm: Inline brw_bo_references.
It's a single atomic add, so it makes sense to inline it.

Improves performance in Piglit's drawoverhead microbenchmark's
"DrawArrays ( 1 VBO, 0 UBO,  0    ) w/ no state change" subtest by
0.400922% +/- 0.310389% (n=350) on my i7-7700HQ.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-31 23:19:34 -07:00
Dave Airlie
b8bea9a050 Revert "st_glsl_to_tgsi: rewrite rename registers to use array fully."
This reverts commit 3008161d28,
which caused a regression for VMware.

The initial code had some recursion in it that I removed by accident.
Trying to add back the recursion broke lots of things, so take the high
road and revert for now.

Fixes: 3008161d (st_glsl_to_tgsi: rewrite rename registers to use array fully.)
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-01 03:43:29 +01:00
Dave Airlie
df61a05019 radv: handle 10-bit format clamping workaround.
This fixes:
dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.*
for a2r10g10b10 formats as destination on SI/CIK hardware.

This adds support to the meta program for emitting 10-bit
outputs, and adds 10-bit support to the fragment shader key.

It also only does the int8/10 on SI/CIK.

Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-01 00:10:23 +01:00
Bas Nieuwenhuizen
b7dd86a04e gallium/targets: Fix d3dadapter9 build after xmlconfig move.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: 601093f95d "xmlconfig: move into src/util"
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-31 22:50:13 +02:00
Bas Nieuwenhuizen
8229706ad8 radv: Don't underflow non-visible VRAM size.
In some APU situations the reported visible size can be larger than
VRAM size. This properly clamps the value.

Surprisingly both CTS and spec seem to allow a heap type with size 0,
so this seemed like the easiest option to me.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: 4ae84efbc5 "radv: Use enum for memory heaps."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2017-07-31 22:50:13 +02:00
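
Note on 8229706ad8: the underflow and the clamp can be sketched as below. This is not the radv code; the function names are hypothetical, and 64-bit unsigned wraparound is emulated with a mask because Python integers do not overflow.

```python
U64_MASK = (1 << 64) - 1  # emulate uint64_t arithmetic


def non_visible_unclamped(vram_size, visible_size):
    """Naive subtraction: wraps around when visible_size > vram_size."""
    return (vram_size - visible_size) & U64_MASK


def non_visible_clamped(vram_size, visible_size):
    """Clamp the visible size first, so the result is never below zero."""
    return vram_size - min(vram_size, visible_size)


MIB = 1024 * 1024
```

For an APU-like case such as `vram_size = 256 * MIB` and `visible_size = 512 * MIB`, the unclamped form yields a huge bogus size while the clamped form yields 0, which the commit message notes is an acceptable heap size.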
Rob Herring
d0540d5b85 Android: fix xmlconfig build
Commit 601093f95d ("xmlconfig: move into src/util") broke the Android
build due to missing libexpat dependency:

external/mesa3d/src/util/xmlconfig.c:34:10: fatal error: 'expat.h' file not found

Fixes: 601093f95d ("xmlconfig: move into src/util")
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-07-31 15:48:33 -05:00
Adam Jackson
d4ca66a159 docs: Update feature list for GL 4.6
ARB_polygon_offset_clamp and ARB_texture_filter_anisotropic look like
they'd be pretty trivial to wire up.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2017-07-31 16:35:17 -04:00
Eric Engestrom
70c6f656f9 util/ra: fix memory leak
CID: 1415909
Fixes: 7a34a0e890 "ra: Add a callback for selecting a register from what's available."
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-31 12:55:19 -07:00
Samuel Pitoiset
110dda0e3f mesa: drop unnecessary GLAPIENTRY to _mesa_init_line()
Noticed randomly.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-31 19:16:30 +02:00
Samuel Pitoiset
58acc32a5e mesa: only check errors when the state change in glClipControl()
When this GL call is a no-op, it should be a little faster, but only in
the error-checking path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-31 19:08:44 +02:00
Samuel Pitoiset
56bea2a266 mesa: only check errors when the state change in glPointSize()
When this GL call is a no-op, it should be a little faster, but only in
the error-checking path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-31 19:08:44 +02:00
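
A minimal sketch of the ordering this kind of change suggests, assuming the no-op test is moved ahead of validation. The context struct, the error counter and the function name are hypothetical stand-ins, not Mesa's real types or entry points.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the GL context: one piece of state plus a
 * counter that represents recorded GL errors. */
struct toy_context {
   float point_size;
   int error_count;
};

/* Returns true only if the stored state was actually changed. */
static bool
toy_point_size(struct toy_context *ctx, float size)
{
   /* No-op: the requested value is already current, so return before
    * doing any validation work. */
   if (ctx->point_size == size)
      return false;

   /* Validate only when the state would really change. */
   if (size <= 0.0f) {
      ctx->error_count++;   /* stands in for raising GL_INVALID_VALUE */
      return false;
   }

   ctx->point_size = size;
   return true;
}
```

The early return is safe in this sketch because the stored value is only ever set after passing validation, so an invalid argument can never compare equal to it.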
Samuel Pitoiset
c6ba702979 mesa: only check errors when the state change in glCullFace()
When this GL call is a no-op, it should be a little faster, but only in
the error-checking path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-31 19:08:44 +02:00
Samuel Pitoiset
c787477378 mesa: only check errors when the state change in glProvokingVertex()
When this GL call is a no-op, it should be a little faster.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-31 19:08:44 +02:00
Marek Olšák
6d37bcdb79 dri_interface: document loaderPrivate for getCapability
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-31 18:30:00 +02:00
Nicolai Hähnle
86d4b46d66 ac/common: always build NIR translation
radeonsi needs it now, and we require LLVM 3.9 anyway.

Fixes a build with radeonsi but not radv.
2017-07-31 17:59:10 +02:00
Rob Herring
be5773fa8d Android: fix compile error for DRI2 loader getCapability
Fix compile failure from commit 1bf703e4ea ("dri_interface,egl,gallium:
only expose RGBA visuals on Android").

Fixes: 1bf703e4ea ("dri_interface,egl,gallium: only expose RGBA visuals on Android")
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-07-31 10:33:15 -05:00
Nicolai Hähnle
90c8f17cf8 Attempt to fix AppVeyor build, round 2
2017-07-31 17:19:13 +02:00
Marek Olšák
d85802e501 Revert "st/mesa: release sampler views when redefining a texture in st_context_teximage"
This reverts commit 5c1241268b.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101961

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
2017-07-31 17:11:30 +02:00
Nicolai Hähnle
49bdb73bec Attempt to fix the AppVeyor build
2017-07-31 17:04:30 +02:00
Nicolai Hähnle
601093f95d xmlconfig: move into src/util
v2: attempt to fix Android build (Emil)

v3: add missing include path

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2017-07-31 15:38:41 +02:00
Nicolai Hähnle
1e40d2c882 xmlconfig: remove GL type dependencies
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 15:37:51 +02:00
Nicolai Hähnle
2879a602dd radeonsi: ensure that temp array allocas are in the entry block
Otherwise, code generation fails. This has become necessary since some
shaders are wrapped in control flow.

Fixes: 081ac6e5c6 ("radeonsi/gfx9: always wrap GS and TCS in an if-block (v2)")
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 15:00:22 +02:00
Nicolai Hähnle
dfe237aef9 radeonsi: enable R600_DEBUG=nir for vertex and fragment shaders
Also, disable geometry and tessellation shaders. Mixing and matching NIR
and TGSI shaders should work (and I've tested it for the VS/PS interface),
but geometry and tessellation require VS-as-ES/LS, which isn't implemented
yet for NIR.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:45 +02:00
Nicolai Hähnle
3b4f481c60 radeonsi: VS as ES/LS are not yet supported with R600_DEBUG=nir
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:44 +02:00
Nicolai Hähnle
3997b10f74 radeonsi/nir: lower uniforms to UBO loads
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:44 +02:00
Nicolai Hähnle
b7d36efc2d ac/nir: implement load_frag_coord intrinsic
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:44 +02:00
Nicolai Hähnle
d5741489d3 radeonsi/nir: lower txp instructions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:44 +02:00
Nicolai Hähnle
bcf85fcd9a ac/nir: pass ac_llvm_context to unpack_param
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:44 +02:00
Nicolai Hähnle
1c64637c26 ac/nir,radeonsi: add and use ac_shader_abi::frag_pos
v2: update for LLVMValueRefs in ac_shader_abi

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:43 +02:00
Nicolai Hähnle
f03c54e05a ac/nir,radeonsi: add and use ac_shader_abi::{ancillary,sample_coverage}
v2: update for LLVMValueRefs in ac_shader_abi

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:43 +02:00
Nicolai Hähnle
7de445377c ac/nir,radv: move force_persample to ac_shader_info::force_persample
Avoid accessing radv-specific structures during the meat of NIR-to-LLVM
translation.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:43 +02:00
Nicolai Hähnle
25ff22e390 radeonsi: tweak next-shader assumptions when streamout is used
VS with streamout is always a HW VS.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:43 +02:00
Nicolai Hähnle
a69afb68c9 radeonsi: use new function ac_build_umin for edgeflag clamping
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:42 +02:00
Nicolai Hähnle
0f9e32519b ac/nir: clamp shadow texture comparison value on VI
Needed for TC-compatible HTILE in radeonsi for test cases like
piglit spec/arb_texture_rg/execution/fs-shadow2d-red-01.shader_test

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:42 +02:00
Nicolai Hähnle
ac2ab5acad ac/nir: add always_vector argument to ac_build_gather_values_extended
This simplifies a bunch of places that no longer need special treatment
of value_count == 1. We rely on LLVM to optimize away the 1-element vector
types.

This fixes a bunch of bugs where 1-element arrays are indexed indirectly.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:42 +02:00
Nicolai Hähnle
e247357240 ac/nir,radeonsi: add ac_shader_abi::front_face
v2: update for LLVMValueRefs in ac_shader_abi

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:42 +02:00
Nicolai Hähnle
28634ff7d3 ac/nir: pass ac_nir_context to emit_ddxy
Allocating the ddxy_lds is considered to be part of the API shader
translation and not part of the ABI.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:41 +02:00
Nicolai Hähnle
c5f3912e13 ac/nir: pass ac_nir_context to SSBO intrinsic handlers
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:41 +02:00
Nicolai Hähnle
a0af3daf9c radeonsi: implement and use ac_shader_abi::load_ssbo
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:41 +02:00
Nicolai Hähnle
d46018a4d7 radeonsi: make get_indirect_index globally visible
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:41 +02:00
Nicolai Hähnle
b78eae6f2a ac/nir: load buffer descriptors via ac_shader_abi::load_ssbo
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:40 +02:00
Nicolai Hähnle
aa66fec47e ac/nir: pass ac_nir_context to emit_discard_if
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:40 +02:00
Nicolai Hähnle
4ba201ee36 ac/nir: extract shader_info->fs.can_discard from NIR shader info
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:40 +02:00
Nicolai Hähnle
41d4016e06 radeonsi/nir: perform radeonsi-specific lowering and optimization passes
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:40 +02:00
Nicolai Hähnle
b49c2c9fa3 radeonsi/nir: perform lowering of input/output driver locations
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:40 +02:00
Nicolai Hähnle
9061dca872 ac/nir: handle old-style shadow tex instructions correctly
The first element is only extracted for new-style shadow tex.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:39 +02:00
Nicolai Hähnle
07597632a5 ac/nir: whitespace fixes
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:39 +02:00
Nicolai Hähnle
ba06e8bbe8 ac/nir: use shader_info pass to determine whether instance_id is used
This improves the separation of ABI and NIR translation.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:39 +02:00
Nicolai Hähnle
be0488a173 ac/nir: move setting shader_info->fs.writes_memory to radv-specific code
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:39 +02:00
Nicolai Hähnle
8d23575c96 radeonsi/nir: add image descriptor loading
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:39 +02:00
Nicolai Hähnle
f37f9aed84 ac/nir: add image and write parameter to ac_shader_abi::load_sampler_desc
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:38 +02:00
Nicolai Hähnle
b36b6f76fa ac/nir: add support for arrays-of-arrays to get_sampler_desc
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:38 +02:00
Nicolai Hähnle
677bd47cb9 radeonsi/nir: set si_shader_context::num_{sampler,images}
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:38 +02:00
Nicolai Hähnle
7c27ef182c radeonsi/nir: implement ac_shader_abi::load_sampler_desc
v2: remove enum desc_type from radeonsi (Marek)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:38 +02:00
Nicolai Hähnle
35b7b3a80f ac/nir: pass ac_nir_context to tex_fetch_ptrs and related functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:37 +02:00
Nicolai Hähnle
6ff5317589 ac/nir: add and use ac_shader_abi::load_sampler_desc
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:37 +02:00
Nicolai Hähnle
57fbf3f9eb ac/nir: pass ac_nir_context to visit_tex and various related functions
Get most of the churn out of the way before actually loading samplers
via the ABI.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:37 +02:00
Nicolai Hähnle
7763c7b2ba ac/nir,radeonsi: add ac_shader_abi::chip_class
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:37 +02:00
Nicolai Hähnle
a6f597536d radeonsi/nir: emit FS outputs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:37 +02:00
Nicolai Hähnle
c41a8e2ad9 radeonsi/nir: load FS inputs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:36 +02:00
Nicolai Hähnle
8643d41622 radeonsi/nir: load VS inputs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:36 +02:00
Nicolai Hähnle
d007919d99 ac/nir,radeonsi: add ac_shader_abi::load_ubo
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:36 +02:00
Nicolai Hähnle
220ed150bc ac/nir: pass ac_nir_context to visit_load_ubo_buffer
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:35 +02:00
Nicolai Hähnle
df62e5eed0 ac/nir: pass ac_nir_context to visit_{load,store}_var and get_deref_offset helper
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:35 +02:00
Nicolai Hähnle
e139705c98 ac/nir: pass ac_llvm_context to some helper functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:35 +02:00
Nicolai Hähnle
cb96a36b04 ac/nir: pass ac_nir_context to visit_intrinsic
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:35 +02:00
Nicolai Hähnle
48737e1890 ac/nir: add ac_nir_context::main_function
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:35 +02:00
Nicolai Hähnle
2be774b196 ac/nir: split scanning outputs from setting up output allocas
The scanning phase sets the driver_location, because it is part of the
ABI: radeonsi does the assignment differently.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:34 +02:00
Nicolai Hähnle
1a508cf8d6 ac/nir: pass ac_llvm_context to *build_alloca* helpers
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:34 +02:00
Nicolai Hähnle
b99a169869 ac/nir: use ac_shader_abi::emit_outputs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:34 +02:00
Nicolai Hähnle
0c3b6a4bd9 ac,radeonsi: add ac_shader_abi::emit_outputs for hardware VS shaders
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:34 +02:00
Nicolai Hähnle
1ea972e08a radeonsi: pass si_shader_context to get_primitive_id
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:33 +02:00
Nicolai Hähnle
9df23db13d radeonsi: translate NIR to LLVM
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:33 +02:00
Nicolai Hähnle
d77526ee30 radeonsi: dump NIR instead of TGSI when appropriate
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:33 +02:00
Nicolai Hähnle
c5f70a5174 radeonsi: bypass the shader cache for NIR shaders
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:33 +02:00
Nicolai Hähnle
29d7bdd179 radeonsi: scan NIR shaders to obtain required info
v2: set num_instruction to 2, i.e. 1 + END (Marek)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:32 +02:00
Nicolai Hähnle
73c7e92d3a ac/nir: add ac_shader_abi::inputs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:32 +02:00
Nicolai Hähnle
b2367cfcc7 ac/nir: begin splitting off ac_nir_context
The eventual goal is to hide all radv-specific details behind
ac_nir_context::abi, so that the NIR->LLVM code can be re-used by
radeonsi.

During development, we live with a partial split, where some of the
NIR->LLVM code still relies on linking back to the nir_to_llvm_context
(which should ultimately be renamed to reflect that it's radv-specific).
The idea is to get rid of these backlinks over time.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:32 +02:00
Nicolai Hähnle
90b3ba8970 radeonsi: add si_shader_selector::nir
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:32 +02:00
Nicolai Hähnle
acd09389cb radeonsi: implement pipe_screen::get_compiler_options for NIR
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:31 +02:00
Nicolai Hähnle
da62a31c5b radeonsi: add nir include paths
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:31 +02:00
Nicolai Hähnle
fa5ae8db2e ac/nir: start using ac_shader_abi
v2: update for LLVMValueRefs in ac_shader_abi

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:31 +02:00
Nicolai Hähnle
61ad2f13c3 ac,radeonsi: move some VS input descriptions to ac_shader_abi
v2: use LLVM values instead of function parameter indices

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:31 +02:00
Nicolai Hähnle
c7e9ebb3ab radeonsi: store shader function arguments in a structure
Aligns the code a bit more with ac/nir, and simplifies the setup of
ac_shader_abi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:31 +02:00
Nicolai Hähnle
00476907fc gallium/targets: link against NIR when building radeonsi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:30 +02:00
Nicolai Hähnle
e044e9eb2a st/glsl_to_nir: move nir_lower_io to drivers
This allows drivers more freedom in how exactly they want to lower I/O,
e.g. first lowering I/O to temporaries.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:30 +02:00
Nicolai Hähnle
c5f97eab09 st/mesa: get rid of st_glsl_types
It's a duplicate of glsl_type::count_attribute_slots.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:30 +02:00
Nicolai Hähnle
2cf8c84619 st/glsl_to_nir: use nir_lower_samplers_as_deref when requested by the driver
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:29 +02:00
Nicolai Hähnle
06d038c4bd st/glsl_to_nir: fix the case where NIR clone testing is enabled
In that case, prog->nir must be assigned at the end.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:29 +02:00
Nicolai Hähnle
01f1598a40 gallium: add PIPE_CAP_NIR_SAMPLERS_AS_DEREF
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:29 +02:00
Nicolai Hähnle
e902ac3268 nir: add nir_lower_uniforms_to_ubo pass
This is a further lowering of default-block uniform loads that transforms
load_uniform intrinsics into load_ubo intrinsics. This simplifies the rest
of the backend.

v2: transform from load_uniform instead of straight from variables

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-07-31 14:55:29 +02:00
Nicolai Hähnle
bce6f99875 nir: add nir_lower_samplers_as_deref pass
This pass is a replacement for the nir_lower_samplers pass, which has the
advantage of keeping sampler references as derefs. This allows a unified
treatment of texture instructions and image intrinsics in the backend.
2017-07-31 14:55:29 +02:00
Nicolai Hähnle
f1da97ef7a nir: add load_frag_coord system value intrinsic
Some drivers prefer to treat gl_FragCoord as a system value rather than
a fragment shader input, see Const.GLSLFragCoordIsSysVal.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-31 14:55:28 +02:00
Nicolai Hähnle
5011923e09 nir: fix nir_lower_wpos_ytransform when gl_FragCoord is a system value
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-31 14:55:28 +02:00
Nicolai Hähnle
b27c2d402e nir: add nir_instr_rewrite_deref
Allows modifying a texture instruction's texture and sampler derefs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-31 14:55:28 +02:00
Samuel Pitoiset
540b1a8f0b mesa: add KHR_no_error support to glPointSize()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:40 +02:00
Samuel Pitoiset
1693ab6c3a mesa: add point_size() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:40 +02:00
Samuel Pitoiset
d88d60ab1d mesa: add KHR_no_error support to glVertexArrayElementBuffer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:40 +02:00
Samuel Pitoiset
1429b5cd59 mesa: add vertex_array_element_buffer() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:40 +02:00
Samuel Pitoiset
80a845538a mesa: add KHR_no_error support to glTextureSubImage*D()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:40 +02:00
Samuel Pitoiset
de0b1e5a81 mesa: add texturesubimage_error() helper
And make texturesubimage() always inline.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:40 +02:00
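
The "_error() helper plus always-inline worker" split that recurs in this series can be sketched roughly as below. Everything here is a hypothetical illustration: the toy context, the unit limit and the function names are assumptions, and plain `static inline` stands in for whatever forced-inline annotation the real code uses.

```c
#include <stdbool.h>

#define TOY_MAX_UNITS 8   /* hypothetical limit, for illustration only */

/* Hypothetical stand-in for the GL context. */
struct toy_context {
   int bound_unit;
   int error_count;
};

/* Shared worker. Each wrapper passes a compile-time constant for
 * no_error, so once inlined the compiler can drop the validation
 * branch entirely in the no_error variant. */
static inline bool
toy_bind(struct toy_context *ctx, int unit, bool no_error)
{
   if (!no_error && (unit < 0 || unit >= TOY_MAX_UNITS)) {
      ctx->error_count++;   /* stands in for recording a GL error */
      return false;
   }

   ctx->bound_unit = unit;
   return true;
}

/* Validating entry point. */
static bool
toy_bind_error(struct toy_context *ctx, int unit)
{
   return toy_bind(ctx, unit, false);
}

/* KHR_no_error-style entry point: the caller promises valid input. */
static bool
toy_bind_no_error(struct toy_context *ctx, int unit)
{
   return toy_bind(ctx, unit, true);
}
```

Keeping one worker avoids duplicating the state-changing code between the two entry points, while the constant flag keeps the no_error path free of checks.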
Samuel Pitoiset
ffe8813b02 mesa: add KHR_no_error support to glDetachShader() and glDetachObjectARB()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:40 +02:00
Samuel Pitoiset
6b9087a45d mesa: add detach_shader_error() helper
And make detach_shader() always inline.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:40 +02:00
Samuel Pitoiset
c8ea792723 mesa: add KHR_no_error support to glDrawTransformFeedback*()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:40 +02:00
Samuel Pitoiset
a187fcf584 mesa: add KHR_no_error support to glNamedFramebufferDrawBuffers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:40 +02:00
Samuel Pitoiset
9337e4d38a mesa: add KHR_no_error support to glDrawBuffers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:40 +02:00
Samuel Pitoiset
966108a803 mesa: add draw_buffers_error() helper
And make draw_buffers() always inline.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:40 +02:00
Samuel Pitoiset
1dd2003396 mesa: add KHR_no_error support to glDeleteBuffers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:40 +02:00
Samuel Pitoiset
fc039e9ff4 mesa: add delete_buffers() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
7bc50dfe79 mesa: add KHR_no_error support to glNamedFramebufferRenderbuffer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
0a20e43ff0 mesa: add KHR_no_error support to glFramebufferRenderbuffer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
55188f7db8 mesa: add framebuffer_renderbuffer_error() helper
And make framebuffer_renderbuffer() always inline.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
ad427f54aa mesa: add KHR_no_error support to glDeleteTextures()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
488af1292a mesa: add delete_textures() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
8bf786d13d mesa: add KHR_no_error support to glNamedFramebufferDrawBuffer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
0238976665 mesa: add KHR_no_error support to glDrawBuffer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
e42775ba68 mesa: add draw_buffer_error() helper
And make draw_buffer() always inline.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
d952485d7c mesa: add KHR_no_error support to glBindTextures()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
90f691b5be mesa: add bind_textures() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
9f1fab9533 mesa: add KHR_no_error support to glBindTexture()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
cba013d423 mesa: add bind_texture() helper
For KHR_no_error support.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
a77768bf60 mesa: rename bind_texture() to bind_texture_object()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
a43ac5e7c4 mesa: add KHR_no_error support to glMemoryBarrierByRegion()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
4de0033d73 mesa: add memory_barrier_by_region() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
d21ae02fb2 mesa: add KHR_no_error support to glMultiDrawArrays()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
e98cb5bbca mesa: add KHR_no_error support to glMinSampleShading()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
bd805a3c31 mesa: add min_sample_shading() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
d5d01193f4 mesa: add KHR_no_error support to glBlendEquationSeparate()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
4141fab9ed mesa: add blend_equation_separate() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
6f6c5a1ad7 mesa: add KHR_no_error support to glPrimitiveRestartIndex()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
de4e5b4dac mesa: add primitive_restart_index() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
bc1c45d0ed mesa: add KHR_no_error support to glGenerate*Mipmap()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
15603acfd9 mesa: add generate_texture_mipmap_error() helper
And make generate_texture_mipmap() always inline.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
ef0b038981 mesa: add KHR_no_error support to glDeleteSamplers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
3339b53755 mesa: add delete_samplers() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
c566ba29b6 mesa: add KHR_no_error to glDeleteVertexArrays()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
ef22651d7e mesa: add delete_vertex_arrays() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
4b0a33d233 mesa: add KHR_no_error to glBindVertexArray()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
81fa33171d mesa: add bind_vertex_array() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
c88649246f mesa: add KHR_no_error support to glInvalidate*()
These are just no-ops because we don't actually do anything
useful in the errors path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
fee507b909 mesa: add KHR_no_error support to glRead*Pixels*()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
333663f601 mesa: add read_pixels() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
f60b16ef27 mesa: add KHR_no_error support to glMultiDraw*Indirect*()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
eb6b299720 mesa: add KHR_no_error support to glMultiDrawElementsBaseVertex()
Just skip validation when no_error is enabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
0e69fc92e2 mesa: add KHR_no_error support to glVertexArrayBindingDivisor()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
a466b74241 mesa: add KHR_no_error support to glVertexBindingDivisor()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
6f4741e32e mesa: add KHR_no_error support to gl{Create,Gen}VertexArrays()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
ab0f246672 mesa: add gen_vertex_arrays_err() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
3e637918ec mesa: add KHR_no_error support to glTextureStorage*D()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
bc38214d76 mesa: rename texturestorage() to texturestorage_error()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
8ca88da368 mesa: add KHR_no_error support to glTexStorage*D()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
14f1613c6f mesa: rename texstorage() to texstorage_error()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
30c36ff335 mesa: add texture_storage_error() helper
And make texture_storage always inline.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
33f2b45e26 mesa: add KHR_no_error support to glBindSampler()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
6fd8255c2a mesa: add bind_sampler() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
18bd5d2c8c mesa: add KHR_no_error support to glBindSamplers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
698ae2f0ef mesa: add bind_samplers() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
2af699a2c7 mesa: add KHR_no_error support to glProgramParameteri()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
c95bf616a8 mesa: add program_parameteri() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
667a6e8122 mesa: add KHR_no_error support to glDeleteSync()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
81f3a6b29a mesa: add delete_sync() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
8981f90091 mesa: add KHR_no_error support to glWaitSync()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
c6f81a1df8 mesa: add wait_sync() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
2a4d5dce74 mesa: add KHR_no_error support to glTextureView()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
68c43ae8b2 mesa: add texture_view() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
789cc87063 mesa: add KHR_no_error support to glPatchParameteri()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
905ad0d1dd mesa: add KHR_no_error support to glBlendEquationiARB()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
474f4b343b mesa: add blend_equationi() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
6c15260ecc mesa: add KHR_no_error support to glSampleMaski()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
5f51c970a9 mesa: add sample_maski() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
e4b170e4f0 mesa: add KHR_no_error support to glDepthRangeArrayv
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Samuel Pitoiset
999f2de9a8 mesa: add depth_range_arrayv() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-31 13:53:39 +02:00
Marek Olšák
1bf703e4ea dri_interface,egl,gallium: only expose RGBA visuals on Android
X/GLX can't handle them. This removes almost 500 GLX visuals that were
incorrectly exposed.

Add an optional getCapability callback for querying what the loader can do.

I'm not splitting this patch, because it's already too small.

v2: also add the callback to __DRIimageLoaderExtension

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
2017-07-31 12:49:30 +02:00
Marek Olšák
5d8359ff4d radeonsi: expose MRT-draw-calls to HUD
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-31 12:46:43 +02:00
Samuel Pitoiset
ddc32537d6 mesa: clamp viewport values only once when using glViewport()
It's useless to clamp the same values for all viewports.

+7% in the "viewport change" test (drawoverhead benchmark).

v2: - call clamp_viewport() in all callers of set_viewport_no_notify()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
2017-07-31 12:14:10 +02:00
Samuel Pitoiset
58473f8b87 mesa: make _mesa_check_init_viewport() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-31 12:14:07 +02:00
Kenneth Graunke
7ea4cda2ab gallium: Fix Thomas's email address
Commit 8771285054 misspelled Hellstrom.
2017-07-28 13:41:51 -07:00
Kenneth Graunke
19c90481d4 i965: s/Tungsten Graphics/VMware/ in brw_bufmgr.c.
In commit 8771285054, José replaced the
Tungsten Graphics copyright notices with VMware, as Tungsten is gone.

I later imported brw_bufmgr.c, reintroducing a Tungsten copyright.
This commit does the equivalent of José's change to the new file.
2017-07-28 13:39:22 -07:00
Kenneth Graunke
1386a8fd13 i965: Reformat the copyright header in brw_bufmgr.c
This reformats the copyright header to match what we use in most of the
newer parts of the driver.  There are a few minor alterations: we change
"COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS" to the standard
"AUTHORS OR COPYRIGHT HOLDERS", and move the permission notice to the
proper place (it should be in the middle, so "next paragraph" actually
refers to something).

Both of these changes match the OSI's MIT License text:
https://opensource.org/licenses/MIT

I copied this from genX_state_upload.c.
2017-07-28 13:35:30 -07:00
Marek Olšák
f4d095cc65 radeonsi: update dirty_level_mask only when flushing or unbinding framebuffer
This fixes corruption with bindless textures in Dawn Of War 3.

The do_update_surf_dirtiness mechanism was complicated and dirty_level_mask
was only updated after the first draw call. The problem is bindless textures
are checked for decompression every draw call and we would only decompress
after the first draw call. The solution is to set dirtiness after the last
draw call to the framebuffer, so the (unconditional) decompression of
bindless textures happens at the right time.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-07-28 16:34:24 +02:00
Marek Olšák
221fdae38b Revert "drirc: whitelist glthread for Mount and Blade Warband"
This reverts commit a7617a49fb.

glthread disables itself automatically and therefore has no effect
on the game.
2017-07-28 16:34:24 +02:00
Samuel Pitoiset
3f38e64270 st/mesa: remove useless st_bufferobj_validate_usage()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-28 11:22:40 +02:00
Samuel Pitoiset
8971513e90 st/mesa: remove st_cache.h
It contains unused prototypes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-28 11:22:21 +02:00
Samuel Pitoiset
f99e9335e2 st/glsl_to_tgsi: fix getting the image type for array of structs
Since array splitting for AoA is disabled, we have to retrieve
the type of the first non-array type when an array of images is
declared inside a structure. Otherwise, it will hit an assert
in glsl_type::sampler_index() because it expects either a sampler
or an image type.

This fixes a regression in the following piglit test:
arb_bindless_texture/compiler/images/arrays-of-struct.frag

Fixes: 57165f2ef8 ("glsl: disable array splitting for AoA")
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-28 11:20:43 +02:00
Samuel Pitoiset
b38c9c57f2 mesa: fix mismatch when returning 64-bit bindless uniform handles
The slower convert-and-copy process performs a bad conversion
because it converts the value to signed 64-bit integer, but
bindless uniform handles are considered unsigned 64-bit.

This fixes "Check glUniform*() with mixed texture units/handles"
from arb_bindless_texture-uniform piglit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-28 11:20:39 +02:00
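Note on b38c9c57f2: the signedness mismatch can be illustrated with a minimal sketch. Both helper names are hypothetical and are not Mesa functions. The sketch assumes that the conversion to a signed 64-bit integer saturates at INT64_MAX, which is one way such a conversion can alter a handle.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: routes the handle through a signed 64-bit
 * integer with saturation, so any handle above INT64_MAX is altered. */
uint64_t handle_via_int64(uint64_t handle)
{
    int64_t tmp = handle > (uint64_t)INT64_MAX ? INT64_MAX : (int64_t)handle;

    assert(tmp >= 0); /* the saturated value is never negative */
    return (uint64_t)tmp;
}

/* Hypothetical helper: keeps the handle unsigned from end to end. */
uint64_t handle_via_uint64(uint64_t handle)
{
    return handle;
}
```

A handle with the top bit set survives only the unsigned path; small handles are returned unchanged by both.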
Samuel Pitoiset
e0e79f0b08 mesa: remove gl_sync_object::Type field
This is useless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-28 11:20:37 +02:00
Samuel Pitoiset
ca4d1def39 mesa: drop fence type parameter from NewSyncObject()
This is useless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-28 11:20:34 +02:00
Marek Olšák
28c7fbbe0f radeonsi: rely on CLEAR_STATE for clearing UCP and blend color registers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-28 08:03:24 +02:00
Marek Olšák
7c721b28f6 radeonsi: rely on CLEAR_STATE for resetting the framebuffer and sample mask
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-28 08:03:24 +02:00
Marek Olšák
064550238e radeonsi: use CLEAR_STATE to initialize some registers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-28 08:03:24 +02:00
Marek Olšák
5c1241268b st/mesa: release sampler views when redefining a texture in st_context_teximage
Noticed randomly.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-28 08:03:24 +02:00
Dave Airlie
800d162209 radv: for stencil only set Z tile mode index to same value
On SI this was causing a hang in
dEQP-VK.pipeline.render_to_image.core.2d_array.mipmap.r16g16_sint_s8_uint

This was due to not handling the tile mode index for depth like
I fixed previously for new GPUs.

Fixes: 01d0c5a9 (radv: fix stencil regression since new addrlib import)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-28 04:12:32 +01:00
Dave Airlie
554aa09440 virgl: drop precise modifier.
The host doesn't understand this yet, so drop it for now.

Fixes: virgl regressions.

Fixes: af22adee4f (tgsi: add precise flag to tgsi_instruction)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-28 11:04:35 +10:00
Marek Olšák
7257c171e9 st/mesa: always unconditionally revalidate main framebuffer after SwapBuffers
This fixes the black Feral launcher window.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101867

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
2017-07-28 00:24:39 +02:00
Nicolai Hähnle
06e20c4b8c radeonsi: bail out instead of crashing if the main shader part failed to compile
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-27 21:16:45 +02:00
Nicolai Hähnle
4dd86631f4 radeonsi: update a comment for merged shaders
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-27 21:16:45 +02:00
Nicolai Hähnle
4738dd9546 radeonsi/gfx9: dump previous stage LLVM IR for merged shaders
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-27 21:16:45 +02:00
Nicolai Hähnle
760876a7b1 radeonsi: make sure TCS main output VGPRs don't alias inputs
Avoids an unnecessary move introduced by "radeonsi/gfx9: always wrap GS
and TCS in an if-block (v2)"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-27 21:16:42 +02:00
Nicolai Hähnle
081ac6e5c6 radeonsi/gfx9: always wrap GS and TCS in an if-block (v2)
With merged ESGS shaders, the GS part of a wave may be empty, and the
hardware gets confused if any GS messages are sent from that wave. Since
S_SENDMSG is executed even when EXEC = 0, we have to wrap even
non-monolithic GS shaders in an if-block, so that the entire shader and
hence the S_SENDMSG instructions are skipped in empty waves.

This change is not required for TCS/HS, but applying it there as well
simplifies the logic a bit.

Fixes GL45-CTS.geometry_shader.rendering.rendering.*

v2: ensure that the TCS epilog doesn't run for non-existing patches

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-27 21:16:32 +02:00
Nicolai Hähnle
873789002f radeonsi/gfx9: fix vertex idx in ES with multiple waves per threadgroup
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-27 21:16:32 +02:00
George Kyriazis
194ff5eed1 swr: fix transform feedback logic
The shader that is used to copy vertex data out of the vs/gs shaders to
the user-specified buffer (streamout or SO shader) was not using the
correct offsets.

Adjust the offsets that are used just for the SO shader:
- Make sure that position is handled in the same special way
  as in the vs/gs shaders
- Use the correct offset to be passed in the core
- consolidate register slot mapping logic into one function, since it's
  been calculated in 2 different places (one for calculating the slot mask,
  and one for the register offsets themselves)

Also make room for all attributes in the backend vertex area.

Fixes:
- all vtk GL2PS tests
- 18 piglit tests (16 ext_transform_feedback tests,
  arb-quads-follow-provoking-vertex and primitive-type gl_points)

v2:

- take care of more SGV slots in slot mapping logic
- trim feState.vsVertexSize
- fix GS interface and incorporate GS while calculating vsVertexSize

Note that vsVertexSize is used in the core as the one parameter that
controls vertex size between all stages, so it has to be adjusted appropriately
for the whole vs/gs/fs pipeline.

Also note that GS and SO is not fully implemented.  This will be addressed
later.

Fixes:
- a total of 20 piglit tests

CC: 17.2 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-27 13:54:19 -05:00
Tim Rowley
e21fc2c625 swr/rast: non-regex knob fallback code for gcc < 4.9
gcc prior to 4.9 didn't implement <regex>, causing a startup crash
in the swr knob parameter reading code.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-27 08:31:21 -05:00
Timothy Arceri
2c34b49d9e mesa: check that buffer object is not NULL before initializing it
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-07-27 22:19:52 +10:00
Timothy Arceri
6ee3323d7d glsl: small builtin inline tidy up
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-07-27 22:14:37 +10:00
Dave Airlie
c4652a0a5b virgl: encode index buffer offset.
Fixes arb_vertex_buffer_object-combined-vertex-index

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-27 16:10:07 +10:00
Michel Dänzer
57132d126f st/mesa: Fix inversed test in st_api_destroy_drawable
Fixes a drawable leak.

Fixes: bbc29393d3 ("st/mesa: create framebuffer iface hash table per st manager")
Bugzilla: https://bugs.freedesktop.org/101930
Tested-by: Nick Sarnie <commendsarnex@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-07-27 11:12:24 +09:00
Dave Airlie
e77ff11ffe radv/ac: port SI TC L1 write corruption fix.
This ports 72e46c988 to radv.
    radeonsi: apply a TC L1 write corruption workaround for SI

Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-26 23:39:24 +01:00
Dave Airlie
d4b079e708 radv/winsys: fix padding command stream for SI
We were adding pad to size after creating the object, so we could
submit a CS bigger than the bo created for it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-26 23:38:23 +01:00
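Note on d4b079e708: the ordering problem can be sketched as follows, under the assumption that padding rounds the dword count up to a power-of-two multiple. The function names and the alignment of 8 dwords used in the usage note are illustrative, not taken from the radv winsys.

```c
#include <assert.h>
#include <stdint.h>

/* Round a dword count up to a multiple of `align` (a power of two). */
uint32_t pad_dwords(uint32_t ndw, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (ndw + align - 1) & ~(align - 1);
}

/* Size in bytes of the buffer that must back the command stream.
 * The padding is applied before the size is derived, so the buffer
 * can never be smaller than what is submitted. */
uint32_t cs_bo_size_bytes(uint32_t ndw, uint32_t align)
{
    return pad_dwords(ndw, align) * 4u;
}
```

With 13 dwords and an alignment of 8, the buffer is sized for 16 dwords (64 bytes); sizing it from the unpadded count would give 52 bytes while 64 bytes are submitted.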
Dave Airlie
a81e99f50a radv/ac: realign SI workaround with radeonsi.
This ports: da7453666a
radeonsi: don't apply the Z export bug workaround to Hainan
to radv.

Just noticed in passing.

Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-26 23:38:17 +01:00
Jason Ekstrand
f6e478c213 i965/clear: Don't perform redundant depth clears
We already have this little optimization for color clears.  Now that
we're actually tracking whether or not a slice has any fast-clear
blocks, it's easy enough to add for depth clears too.

Improves performance of GFXBench 4 TRex at 1920x1080 by:
- Skylake GT4: 0.905932% +/- 0.0620197% (n = 30)
- Apollolake:  0.382434% +/- 0.1134730% (n = 25)

v2: (by Ken) Rebase and drop intel_mipmap_tree.c changes, as they're
    no longer necessary (other patches already landed to do that part)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-26 14:43:01 -07:00
Jason Ekstrand
6db193701e i965: Only do depth resolves prior to clearing when needed
When changing the clear value, we need to resolve any fast cleared data.

Previously, we were performing resolves on every slice with HiZ enabled.
We only need to resolve slices that a) have fast clear data, and b)
aren't about to be cleared to the new color.  In the latter case, we
were actually doing a resolve, and then a fast clear - when we could
skip both, causing the existing fast cleared area to be updated to the
new clear value for no additional work.

This patch stops using intel_miptree_prepare_access in favor of a more
optimal open coded loop that knows about our clear operation.

v2: (by Ken) Rebase on islification, write a real commit message.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-26 14:43:01 -07:00
Kenneth Graunke
e1d4030b0b i965: Expose get_num_logical_layers outside of intel_mipmap_tree.c.
I want to use it in brw_clear.c.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-26 14:43:00 -07:00
Marek Olšák
5e81df0f10 ac/surface: fix hybrid graphics where APU=GFX9, dGPU=older
v2: don't do it for compressed textures (bpp = 0)

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
2017-07-26 19:53:26 +02:00
Marek Olšák
ed2b3f5c81 radeonsi: decrease the number of compiler threads
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-26 19:53:26 +02:00
Marek Olšák
433f6f7ac9 gallium/radeon: make S_FIXED function signed and move it to shared code
This fixes a bug uncovered by:
    2412c4c81e
    util: Make CLAMP turn NaN into MIN.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-26 19:53:26 +02:00
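Note on 433f6f7ac9: a minimal sketch of why a NaN-to-MIN clamp calls for a signed fixed-point conversion. The macro and function names, the [-16, 16] range and the 8 fractional bits are assumptions chosen for illustration; they are not copied from the driver.

```c
#include <assert.h>
#include <math.h>

/* A clamp written so that NaN falls through to the minimum:
 * every comparison involving NaN is false. */
#define CLAMP_NAN_TO_MIN(x, lo, hi) \
    ((x) > (lo) ? ((x) > (hi) ? (hi) : (x)) : (lo))

/* Hypothetical signed fixed-point conversion. Returning a signed
 * integer keeps negative inputs well defined; converting a negative
 * float to an unsigned integer is undefined behaviour in C. */
int to_fixed_signed(float value, unsigned frac_bits)
{
    assert(frac_bits < 16);
    return (int)(value * (float)(1 << frac_bits));
}
```

Once NaN is clamped to a negative minimum, the value reaching the conversion is negative, which is where an unsigned return type stops being safe.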
Marek Olšák
033b4e4340 st/mesa: also clamp and quantize per-unit lod bias
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-26 19:53:26 +02:00
Marek Olšák
914f11e75b st/mesa: fix unconditional return in st_framebuffer_iface_remove
Noticed by James Legg @ Feral.

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-07-26 16:47:17 +02:00
Marek Olšák
a7617a49fb drirc: whitelist glthread for Mount and Blade Warband
Raises the minimum fps from 25-26 to 31. Tested with the game in
conjunction with a mod (Full Invasion 2), the Beaumaris Castle map and 200 bots.
2017-07-26 15:23:00 +02:00
Grigori Goronzy
39bf7756b9 egl: move KHR_no_error vs debug/robustness check further down
We'll fail to flag an error if the context flags appear after the
no-error attribute in the context attribute list.

Delay the check to after attribute parsing to fix this.

Fixes: 4909519a66 ("egl: Add EGL_KHR_create_context_no_error support")
Cc: mesa-stable@lists.freedesktop.org
[Emil Velikov: add fixes/stable tags, commit message polish]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-26 11:50:32 +01:00
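Note on 39bf7756b9: the order dependence goes away if the conflict check runs only after the whole attribute list has been read. The sketch below uses made-up attribute keys and flag bits, not real EGL enum values, and a hypothetical parser function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical attribute keys and flag bits (not EGL values). */
enum { ATTR_NONE = 0, ATTR_NO_ERROR = 1, ATTR_FLAGS = 2 };
enum { FLAG_DEBUG = 1 << 0, FLAG_ROBUST = 1 << 1 };

/* Parses a key/value list terminated by ATTR_NONE.
 * Returns true if the combination of attributes is valid. */
bool parse_context_attribs(const int *attribs)
{
    bool no_error = false;
    int flags = 0;

    assert(attribs != NULL);
    for (size_t i = 0; attribs[i] != ATTR_NONE; i += 2) {
        switch (attribs[i]) {
        case ATTR_NO_ERROR:
            no_error = attribs[i + 1] != 0;
            break;
        case ATTR_FLAGS:
            flags = attribs[i + 1];
            break;
        default:
            return false; /* unknown key */
        }
    }

    /* Checked after parsing, so the order of the keys is irrelevant. */
    if (no_error && (flags & (FLAG_DEBUG | FLAG_ROBUST)))
        return false;
    return true;
}
```

Checking inside the ATTR_NO_ERROR case instead would miss a debug flag that appears later in the list.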
Andres Rodriguez
a973b9a9f8 radv: rename physical_device->uuid[] to cache_uuid[]
We have a few UUIDs, so let's be more specific.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-26 20:42:36 +10:00
Nicolai Hähnle
a0e6b9a2db radeonsi/gfx9: reduce max threads per block to 1024 on gfx9+
The number of supported waves per thread group has been reduced to 16
with gfx9. Trying to use 32 waves causes hangs, and barriers might
not work correctly with > 16 waves.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-26 11:51:00 +02:00
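Note on a0e6b9a2db: the 1024 figure follows from the wave limit if a wave holds 64 threads, which is the GCN wavefront size and is treated here as an assumption. The function name is hypothetical.

```c
#include <assert.h>

/* Assumption: 64 threads per wave. */
enum { THREADS_PER_WAVE = 64 };

/* Largest thread block that fits in the given number of waves. */
unsigned max_threads_per_block(unsigned max_waves_per_group)
{
    assert(max_waves_per_group > 0);
    return max_waves_per_group * THREADS_PER_WAVE;
}
```

Sixteen waves give 1024 threads; the 32 waves that cause hangs would correspond to 2048.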
Nicolai Hähnle
65fbaab0b7 radeonsi: fix detection of DRAW_INDIRECT_MULTI on SI
The firmware version numbers for SI were wrong. The new numbers are probably
too conservative (we don't have a definitive answer from the firmware team),
but DRAW_INDIRECT_MULTI has been confirmed to work with these versions on
Tahiti (by Gustaw) and on Verde (by myself).

While this is technically adding a feature, it's a feature we thought we had
for a long time. The change is small enough and we're early enough in the 17.2
release cycle that it should still go in.

Reported-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-26 11:48:32 +02:00
Iago Toral Quiroga
31f1863ace anv: only expose up to 28 vertex attributes
The EU limit of 128 GRFs should allow 32 vertex elements of 4 GRFs.
However, the maximum allowed value of "Vertex URB Entry Read Length"
in SIMD8 is 15. And 15 * 8 = 120 gives us a limit of 30 vertex elements.
Because we also need to reserve a vertex buffer to upload
VertexIndex/InstanceIndex and another to upload DrawID when needed,
we can only expose 28.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-26 08:16:43 +02:00
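Note on 31f1863ace: the arithmetic in the message can be written out as a short sketch. The constant names are invented for illustration; the numbers (15, 8, 4 GRFs per element, 2 reserved elements) come from the message itself.

```c
#include <assert.h>

enum {
    MAX_URB_READ_LENGTH_SIMD8 = 15, /* maximum "Vertex URB Entry Read Length" */
    READ_LENGTH_MULTIPLIER    = 8,  /* factor used in the message: 15 * 8 = 120 */
    GRFS_PER_VERTEX_ELEMENT   = 4,
    RESERVED_ELEMENTS         = 2   /* VertexIndex/InstanceIndex and DrawID */
};

unsigned max_exposed_vertex_attribs(void)
{
    unsigned grfs = MAX_URB_READ_LENGTH_SIMD8 * READ_LENGTH_MULTIPLIER; /* 120 */
    unsigned elements = grfs / GRFS_PER_VERTEX_ELEMENT;                 /* 30 */

    assert(elements > RESERVED_ELEMENTS);
    return elements - RESERVED_ELEMENTS;                                /* 28 */
}
```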
Iago Toral Quiroga
a848e693ef anv/cmd_buffer: fix off by one error in assertion
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-26 08:02:06 +02:00
Kenneth Graunke
445367242a i965: Shut up Coverity warning about HiZ buffers.
Here the AUX_USAGE_* mode indicates that we have HiZ, so we will have
a HiZ buffer.  But Coverity doesn't know that, so it thinks it might
be NULL because we checked hiz_buf != NULL earlier.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-25 22:14:21 -07:00
Kenneth Graunke
698636cc97 i965: Fix = vs == in MCS aux usage assert.
Caught by Coverity (CID 1415680).

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-25 22:14:21 -07:00
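Note on 698636cc97: a minimal sketch of why `=` inside a check is harmful. The enum and both function names are hypothetical and only mirror the shape of the bug.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical aux usage values. */
enum aux_usage { AUX_NONE = 0, AUX_MCS = 1, AUX_HIZ = 2 };

/* Buggy form: the assignment overwrites the value and the check
 * passes for any nonzero constant. */
int check_with_assignment(enum aux_usage *usage)
{
    assert(usage != NULL);
    return (*usage = AUX_MCS) != 0;
}

/* Intended form: compares without modifying anything. */
int check_with_comparison(const enum aux_usage *usage)
{
    assert(usage != NULL);
    return *usage == AUX_MCS;
}
```

The assignment form both hides the mismatch and silently changes the state being checked.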
Kenneth Graunke
f6e674fa51 i965: Fix offset addition in get_isl_surf.
Increase the value, not the pointer to the stack variable.

Caught by Coverity (CID 1415574).  Not shipped in a real release.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-25 22:14:21 -07:00
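Note on f6e674fa51: the distinction between advancing a pointer and increasing the value it points to, as a minimal sketch with a hypothetical function name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Adds `delta` to the caller's offset. Writing `offset += delta`
 * instead would only move the local pointer and leave the caller's
 * value untouched. */
void add_offset(uint32_t *offset, uint32_t delta)
{
    assert(offset != NULL);
    *offset += delta;
}
```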
Andres Rodriguez
7b48163d7c mesa/st: fix inconsistent indentation of st_cb_bufferobjects.c
No changes, just re-indent.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-26 14:54:46 +10:00
Timothy Arceri
b0333e55b7 compiler: move glsl_interface_packing enum to shader_enums.h
This allows us to drop the duplicate gl_uniform_block_packing enum.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-26 10:39:52 +10:00
Timothy Arceri
7ee383669f mesa/st: fix unused variable warnings
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-26 10:39:52 +10:00
Timothy Arceri
87e5f39cf1 mesa/st: move st_pipe_format_to_mesa_format() call to where it's used
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-26 10:39:52 +10:00
Timothy Arceri
17f05e52e7 gallium/util: fix unused variable warning
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-26 10:39:52 +10:00
Timothy Arceri
5fac8c116e mesa: drop useless assert
NewBufferObj() is called when the shared state is allocated so we
wouldn't get this far if it was NULL.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-07-26 10:16:20 +10:00
Timothy Arceri
6be1c69b97 mesa: call binding functions directly from glDeleteBuffers
This avoids useless error checking.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-07-26 10:16:20 +10:00
Timothy Arceri
003c8b1167 mesa: move static binding functions above _mesa_DeleteBuffers()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-07-26 10:16:20 +10:00
Timothy Arceri
4943353bff mesa: don't try to re-generate the default buffer
It should have been created by this point.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-07-26 10:16:20 +10:00
Eric Anholt
4d4872708e broadcom/vc4: Switch the V3D 2.1 XML over to restricted address fields.
This keeps the flags out of v3d_decode.c's output.  In the generated code,
only the unpack functions see any change (where they now get the
restricted start value), and vc4 doesn't use the unpack functions yet.
2017-07-25 14:55:12 -07:00
Eric Anholt
82fdc10606 broadcom/genxml: Support address fields with <32 bits
I was writing the XML such that the address field overlapped various flags
in the alignment bits, which caused pain when trying to unpack for decode.
Instead, keep the XML matching the docs (address fields don't overlap),
and just infer the appropriate shift value during decode.

During pack, the address is just applied to the appropriate bits
already, ignoring the sub-byte start/end fields.
2017-07-25 14:55:12 -07:00
Eric Anholt
53492917e2 broadcom/vc4: Use the RA callback to improve register selection's choices.
We simply pick r4 if available (anything else would force a MOV), then
round-robin through accumulators (avoids physical regfile RAW delay
slots), then round-robin through the physical regfile.

The effect on instruction count is pretty impressive:

total instructions in shared programs: 76563 -> 74526 (-2.66%)
instructions in affected programs:     66463 -> 64426 (-3.06%)

and we could probably do better with a little heuristic of "if we're going
to choose a physical reg, and other operands of instructions using this as
a src have the same physical regfile, then use the other regfile".
2017-07-25 14:55:10 -07:00
Eric Anholt
7a34a0e890 ra: Add a callback for selecting a register from what's available.
VC4 has had a tension, similar to pre-Sandybridge Intel, where we want to
use low-numbered registers (more parallelism on Intel, fewer delay slots
on vc4), but in order to give instruction scheduling the most freedom to
avoid delays we want to round-robin between registers of the same cost.
Our two heuristics so far have chosen one end or the other of that
tradeoff.

The callback, instead, hands the driver the set of registers that are
available, and the driver gets to make its own choice.  This will be used
in vc4 to round-robin between registers of the same cost, and might be
used in the future for improving bank selection.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-25 14:44:52 -07:00
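Note on 7a34a0e890: a sketch of what a driver-side selection callback could look like when it round-robins between free registers. The struct, the function name and the 32-bit availability mask are assumptions for illustration and do not reproduce the allocator's real interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical driver state: where the next search should start. */
struct rr_state {
    unsigned next;
};

/* Hypothetical callback: bit i of `available` is set when register i
 * is free. Picks the first free register at or after state->next,
 * wrapping around. Returns num_regs if nothing is free. */
unsigned select_reg_round_robin(uint32_t available, unsigned num_regs,
                                struct rr_state *state)
{
    assert(num_regs > 0 && num_regs <= 32);

    for (unsigned i = 0; i < num_regs; i++) {
        unsigned reg = (state->next + i) % num_regs;

        if (available & (UINT32_C(1) << reg)) {
            state->next = (reg + 1) % num_regs;
            return reg;
        }
    }
    return num_regs;
}
```

Successive calls with the same mask hand out different registers, which is the freedom the scheduler benefits from; a cost-aware driver could run the same loop per cost class.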
Eric Anholt
3dae034423 ra: Don't put a node in its own adjacency set.
All the paths looping over adjacency had guards against considering
themselves (the non-obvious one was ra_any_neighbors_conflict(), which has
in_stack set).

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-25 14:44:52 -07:00
Eric Anholt
30146f29a7 ra: Pull the body of a loop out to a helper function.
I was going to indent this code another level, and decided it would be
easier to read as a helper.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-25 14:44:52 -07:00
Eric Anholt
16e17ce04b broadcom/vc4: Scissor blits performed using the rendering engine.
Without this, a BlitFramebuffer would mark the whole framebuffer as being
changed (so we emit loads/stores of all of it) rather than just the
modified subset.
2017-07-25 14:44:52 -07:00
Eric Anholt
93fec49a75 broadcom/vc4: Prefer blit via rendering to the software fallback.
I don't know how I managed to leave this here for so long.  Found when
working on a 1:1 overlapping blit extension for X11.

Cc: mesa-stable@lists.freedesktop.org
2017-07-25 14:44:52 -07:00
Eric Anholt
b3c78a51f3 broadcom/vc4: Switch the Viewport Center fields to a fixed-point representation.
This gets us automatic CL decoding to a floating-point value, and drops a
magic number from the emit code.  250x250 shader runner tests now say they
have a center of 125.0 instead of 2000.
2017-07-25 14:44:52 -07:00
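Note on b3c78a51f3: the two numbers in the message (125.0 and 2000) are consistent with a fixed-point format with 4 fractional bits, since 2000 / 125 = 16. That width is inferred from the message, not from the XML, and the function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Inferred assumption: 4 fractional bits (scale factor 16). */
enum { VIEWPORT_CENTER_FRAC_BITS = 4 };

/* Float to raw fixed-point, as the emit code would pack it. */
int32_t viewport_center_to_fixed(float center)
{
    return (int32_t)(center * (float)(1 << VIEWPORT_CENTER_FRAC_BITS));
}

/* Raw fixed-point back to float, as a decoder would print it. */
float viewport_center_from_fixed(int32_t raw)
{
    return (float)raw / (float)(1 << VIEWPORT_CENTER_FRAC_BITS);
}
```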
Eric Anholt
299c9a2db1 broadcom/vc4: Use the XML decoder for CL dumping.
The VC4_DEBUG_CL output goes from:

0x00000010 0x00000010: 0x06 VC4_PACKET_START_TILE_BINNING
0x00000011 0x00000011: 0x38 VC4_PACKET_PRIMITIVE_LIST_FORMAT
0x00000012 0x00000012: 0x12
0x00000013 0x00000013: 0x66 VC4_PACKET_CLIP_WINDOW
0x00000014 0x00000014: 0x00
0x00000015 0x00000015: 0x00
0x00000016 0x00000016: 0x00
0x00000017 0x00000017: 0x00
0x00000018 0x00000018: 0xfa
0x00000019 0x00000019: 0x00
0x0000001a 0x0000001a: 0xfa
0x0000001b 0x0000001b: 0x00

to:

0x00000010 0x00000010: 0x06 Start Tile Binning
0x00000011 0x00000011: 0x38 Primitive List Format
    Data Type: 1 (16-bit index)
    Primitive Type: 2 (Triangles List)
0x00000013 0x00000013: 0x66 Clip Window
    Clip Window Height in pixels: 250
    Clip Window Width in pixels: 250
    Clip Window Bottom Pixel Coordinate: 0
    Clip Window Left Pixel Coordinate: 0

v2: Squash in robher's fixes for Android
2017-07-25 14:44:52 -07:00
Eric Anholt
5b102160ae broadcom/genxml: Introduce a V3D packet/struct decoder.
This is copied from Intel's XML decoder, modified to handle V3D's
byte-oriented packets.

v2: Squash in robher's fixes for Android
2017-07-25 14:44:52 -07:00
Eric Anholt
12b55c8e27 broadcom: add editorconfig
This is the same 8-space style used in the vc4 and vc5 gallium drivers.
2017-07-25 14:44:52 -07:00
Eric Anholt
decd2b32aa intel/decoder: Reuse the gen_make_gen() helper.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-25 14:44:52 -07:00
Eric Anholt
19ffa4bfb2 intel/decoder: Reuse the MAX2 macro instead of defining another one.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-25 14:44:52 -07:00
Brian Paul
91735e2d4a svga: implement MSAA alpha_to_one feature
The device doesn't directly support this feature so we implement it with
additional shader code which sets the color output(s) w component to
1.0 (or max_int or max_uint).

Fixes 16 Piglit ext_framebuffer_multisample/*alpha-to-one* tests.

v2: only support unorm/float buffers, not int/uint, per Roland.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-25 15:40:24 -06:00
Brian Paul
71d3b69b23 svga: rework the FS white fragments code
When we forcibly write white to FS outputs (for XOR mode emulation)
we were using a temp register.  But that's not really necessary.
This also fixes the case of writing white to multiple color buffers.

Subsequent changes will build on this.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-25 15:40:23 -06:00
Brian Paul
1ab8901d6f gallium/util: s/unsigned/enum tgsi_texture_type/
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-07-25 15:40:23 -06:00
Kamil Páral
22379f7cad drirc: whitelist glthread for Overlord 1+2, Oil Rush, War Thunder, Saints Row 2
Performance delta on Core i5-4570 + Radeon R9 270:
    Overlord: +20% in certain locations
    Overlord II: +20% in certain locations
    Oil Rush: +12% in most locations
    War Thunder: +4-9% in benchmarks
    Saints Row 2: +10-35% in certain locations
2017-07-25 21:29:54 +02:00
Lionel Landwerlin
9f439ae120 i965: perf: flush batchbuffers at the beginning of queries
As Chris commented, it makes more sense to have batch buffer flushes
before the query. Usually applications like frame_retrace do a series
of queries and in that case, with flushes at the end of the queries,
we might still have the first query contained in 2 different batches.
More generally it would be quite usual to have the query contained in
2 batch buffers because we never know the fill rate of the
current batch buffer.

If we move the flushing to the beginning of the queries, it's pretty
much guaranteed that queries will be contained in a single batch
buffer (unless the amount of commands is huge, but then it's only fair
to include reloading request times in the measurements).

Fixes: adafe4b733 ("i965: perf: minimize the chances to spread queries across batchbuffers")
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.2 17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-25 18:56:33 +01:00
Daniel Stone
45383d32d4 st/dri2: Return invalid modifier when no driver support
Always initialise whandle.modifier for DRIImage modifier queries, so if
the driver doesn't support it then we return false for the query.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Fixes: d33fe8b84e ("st/dri: enable DRIimage modifier queries")
2017-07-25 18:40:07 +01:00
Daniel Stone
b4a18f13ce st/dri: Check get-handle return value in queryImage
In the DRIImage queryImage hook, check if resource_get_handle() failed
and return FALSE if so.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-25 18:40:06 +01:00
Michal Srb
e6d7937b86 r600: Add support for B5G5R5A1.
Fixes rendercheck errors when using glamor acceleration in X server.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-07-25 19:17:03 +02:00
Leo Liu
82fcf3142f radeon/vcn: move message buffer to vram for now
To work around an unknown bug.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-07-25 12:27:09 -04:00
Jose Fonseca
8d655263ca trace: Correct transfer box size calculation.
For textures we must not approximate the calculation with `stride *
height`, or `slice_stride * depth`, as that can easily lead to buffer
overflows, particularly for partial transfers.

This should address the issue that Bruce Cherniak found and diagnosed.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-07-25 17:18:04 +01:00
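A rough illustration of the point made in 8d655263ca: a partial transfer box only touches memory up to the end of its last row, so its size is smaller than `stride * height`. The sketch below assumes an uncompressed format with a fixed number of bytes per pixel; `box_size_bytes` and its parameters are hypothetical names, not the code from the trace driver.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of bytes spanned by a transfer box, counted
 * from the address of its first pixel. Full slices and full rows are only
 * counted for everything before the last slice and the last row; the last
 * row contributes just the bytes actually covered by the box.
 */
static size_t
box_size_bytes(size_t width, size_t height, size_t depth,
               size_t bytes_per_pixel, size_t stride, size_t slice_stride)
{
   assert(width > 0 && height > 0 && depth > 0);

   return (depth - 1) * slice_stride   /* all slices before the last one */
        + (height - 1) * stride        /* all rows before the last one */
        + width * bytes_per_pixel;     /* the last, partial row */
}
```

For a 4x2 box of 4-byte pixels in a surface with a 256-byte stride, this gives 272 bytes, whereas `stride * height` would claim 512 and could read or write past the end of the mapping.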
Samuel Pitoiset
c3ea898932 mesa: add active_shader_program() helper
To reduce code duplication.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-25 11:56:06 +02:00
Samuel Pitoiset
b8338f8df2 mesa: add bind_program_pipeline() helper
To reduce code duplication.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-25 11:56:06 +02:00
Tapani Pälli
3392026866 egl: fix whitespace issues from eglimage code
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-25 12:54:33 +03:00
Tapani Pälli
8dba6f8cf4 util: fix warning/error on 32bit build
Add uintptr_t cast to fix 'cast to pointer from integer of different size'
warning on 32bit build (build error on Android M).

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-07-25 12:54:33 +03:00
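The kind of cast described in 8dba6f8cf4 can be sketched as follows. On a 32-bit build a 64-bit integer is wider than a pointer, so casting it directly triggers the "cast to pointer from integer of different size" warning; going through `uintptr_t` makes the narrowing explicit. The function names here are hypothetical and only illustrate the pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: turn a 64-bit handle back into a pointer. */
static void *
handle_to_ptr(uint64_t handle)
{
   /* On 32-bit targets the value must actually fit in a pointer. */
   assert(handle <= UINTPTR_MAX);

   /* uintptr_t has pointer width, so this cast is warning-free. */
   return (void *)(uintptr_t)handle;
}

/* Hypothetical helper: store a pointer in a 64-bit integer field. */
static uint64_t
ptr_to_handle(const void *ptr)
{
   return (uint64_t)(uintptr_t)ptr;
}
```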
Constantine Charlamov
dacb319777 r600g: constify some args at r600_asm.c
Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-25 09:24:27 +02:00
Constantine Charlamov
3823e4905b r600g: remove unused "bc" args, and one unneeded forward declaration
To ease review just highlight "bc," string.

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-25 09:24:17 +02:00
Dave Airlie
6cbc8cf178 radv: only report external semaphore info for opaque fd.
Until we support sync fd, don't report the info.

Fixes crashes in CTS dEQP-VK.api.external.semaphore.sync_fd.*.

Fixes: eaa56eab6 (radv: initial support for shared semaphores (v2))
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-25 15:38:56 +10:00
Jason Ekstrand
d49f51fbf4 i965: Simplify HiZ clears a bit
No need for all that switching when we can just assign a nice little
variable with the number of layers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-24 20:00:15 -07:00
Rafael Antognolli
8c47ccb13a i965: Use {} to initialize GENX_* structs.
gen4 has commands which start with KernelStartPointer, which is a
struct, so if we initialize it with struct = { 0 }, we get warnings on some
compilers:

"GCC (pre 4.9?) can throw a Wmissing-braces on[1] while clang
-Wmissing-field-initializers [2]." - Emil

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53119
[2] https://bugs.llvm.org/show_bug.cgi?id=21689

This change works around that and will silence such warnings. It is both
a GCC and a clang extension.

v2:
   - Use {} instead of memset macro (Matt)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-07-24 16:07:25 -07:00
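A minimal sketch of the situation described in 8c47ccb13a, using made-up struct names rather than the generated GENX_* types. When the first member is itself a struct, `= { 0 }` can trigger -Wmissing-braces or -Wmissing-field-initializers on some compilers, while the empty initializer `= {}` zero-initializes everything without naming any member. As the commit notes, the empty form is a GCC and clang extension (it has since been standardized in C23, as far as I know).

```c
#include <assert.h>

/* Hypothetical stand-ins for a generated command struct. */
struct kernel_pointer {
   unsigned offset;
   unsigned flags;
};

struct example_command {
   struct kernel_pointer KernelStartPointer;   /* first member is a struct */
   unsigned Length;
};

static struct example_command
make_zeroed_command(void)
{
   /* Empty braces: every member, nested or not, starts out as zero. */
   struct example_command cmd = {};

   assert(cmd.KernelStartPointer.offset == 0 && cmd.Length == 0);
   return cmd;
}
```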
Charmaine Lee
bbc29393d3 st/mesa: create framebuffer iface hash table per st manager
With commit 5124bf9823, a framebuffer interface hash table is
created in st_gl_api_create(), which is called in
dri_init_screen_helper() for each screen. When the hash table is
overwritten with multiple calls to st_gl_api_create(), it can cause
a race condition. This patch fixes the problem by creating a
framebuffer interface hash table per state tracker manager.

Fixes crash with steam.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101876
Fixes: 5124bf9823 ("st/mesa: add destroy_drawable interface")
Tested-by: Christoph Haag <haagch@frickel.club>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-07-24 14:03:28 -07:00
Dave Airlie
ca82ef5ac7 radv: fix buffer views on SI/CIK.
Fixes CTS dEQP-VK.memory.pipeline_barrier.host_write_uniform_texel_buffer.1024
on SI/CIK with radv.

Fixes: f4e499ec (radv: add initial non-conformant radv vulkan driver)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-24 21:54:04 +01:00
Daniel Stone
dd072cf4b1 egl/wayland: Ignore invalid modifiers
If the underlying driver does not support modifiers, dmabuf will still
advertise formats through the 'modifier' event, but send them with an
invalid modifier. Ignore them if this is the case, rather than passing
them through to the driver.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Fixes: 02cc359372 ("egl/wayland: Use linux-dmabuf interface for buffers")
2017-07-24 16:42:28 +01:00
Samuel Pitoiset
986f9e50de mesa: return GL_OUT_OF_MEMORY if NewSamplerObject fails
This is similar to other functions that create objects.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-24 16:43:38 +02:00
Samuel Pitoiset
b244846821 mesa: pass the 'caller' function to create_samplers()
To return GL_OUT_OF_MEMORY if NewSamplerObject fails.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-24 16:43:38 +02:00
Samuel Pitoiset
0bc8315e63 mesa: add compressed_tex_sub_image_{error,no_error} helpers
To avoid inlining compressed_tex_sub_image() a bunch of times.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-24 16:43:38 +02:00
Emil Velikov
5d47dd9c2a intel/blorp: ship blorp_genX_exec.h within the tarball
Fixes: c9cb37b2a6 ("intel/blorp: Add a partial resolve pass for MCS")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-24 15:14:21 +01:00
Emil Velikov
06e2a507eb docs: add 17.3.0-devel release notes template
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-24 14:27:15 +01:00
Emil Velikov
61883606c5 mesa: bump version to 17.2.0-devel
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-24 14:20:53 +01:00
Emil Velikov
33236a306d egl: guard wayland header dep. tracking behind HAVE_PLATFORM_WAYLAND
Otherwise we'll attempt to generate the header even if we don't need to.
In that case the dependencies may not be met, leading to build failure.

Fixes: 166852e "configure.ac: rework wayland-protocols handling"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-07-24 14:09:08 +01:00
Emil Velikov
da9e6fdfe2 swrast: add dri2ConfigQueryExtension to the correct extension list
The extension should be in the list as returned by getExtensions().
Seems to have gone unnoticed since close to nobody wants to change the
vblank mode for the software driver.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-07-24 10:58:51 +01:00
Emil Velikov
3057ca9a50 wayland-egl: update the SHA1 of the commit introducing v3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-24 10:36:30 +01:00
Miguel A. Vico
b6356c023d wayland-egl: Update ABI checker
This change updates wayland-egl-abi-check.c with the latest changes to
wl_egl_window.

Signed-off-by: Miguel A. Vico <mvicomoya@nvidia.com>
Reviewed-by: James Jones <jajones@nvidia.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-24 10:27:56 +01:00
Miguel A. Vico
2d5d61bc49 wayland-egl: Make wl_egl_window a versioned struct
We need wl_egl_window to be a versioned struct in order to keep track of
ABI changes.

This change makes the first member of wl_egl_window the version number.

A heuristic in the wayland driver is added so that we don't break
backwards compatibility:

 - If the first field (version) is an actual pointer, it is an old
   implementation of wl_egl_window, and version points to the wl_surface
   proxy.

 - Else, the first field is the version number, and we have
   wl_egl_window::surface pointing to the wl_surface proxy.

Signed-off-by: Miguel A. Vico <mvicomoya@nvidia.com>
Reviewed-by: James Jones <jajones@nvidia.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-24 10:27:52 +01:00
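The heuristic from 2d5d61bc49 can be sketched roughly as below. All struct layouts and names are simplified stand-ins, not the real wl_egl_window definition, and `looks_like_pointer` is a deliberately naive placeholder for a real dereferenceability check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct surface { int id; };        /* stand-in for the wl_surface proxy */

/* Hypothetical old layout: the first member is the surface pointer. */
struct window_old {
   struct surface *surface;
   int width, height;
};

/* Hypothetical versioned layout: the first member is the version number. */
struct window_versioned {
   intptr_t version;
   int width, height;
   struct surface *surface;
};

typedef bool (*ptr_check_fn)(const void *);

/* Toy predicate (assumption): small integers are versions, not pointers. */
static bool
looks_like_pointer(const void *p)
{
   return (uintptr_t)p > 4096;
}

static struct surface *
window_get_surface(const void *window, ptr_check_fn is_dereferenceable)
{
   intptr_t first;

   assert(window != NULL);
   /* Both layouts start with one pointer-sized field; read it raw. */
   memcpy(&first, window, sizeof(first));

   if (is_dereferenceable((const void *)first))
      return (struct surface *)first;   /* old layout: it is the surface */

   /* New layout: the first field was a version number. */
   return ((const struct window_versioned *)window)->surface;
}
```

Passing the check as a function pointer keeps the sketch independent of how dereferenceability is actually decided.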
Miguel A. Vico
63c251e38f egl: Fix _eglPointerIsDereferencable() to ignore page residency
mincore() returns 0 on success, and -1 on failure.  The last parameter
is a vector of bytes with one entry for each page queried.  mincore
returns page residency information in the first bit of each byte in the
vector.

Residency doesn't actually matter when determining whether a pointer is
dereferenceable, so the output vector can be ignored.  What matters is
whether mincore succeeds. See:

  http://man7.org/linux/man-pages/man2/mincore.2.html

Signed-off-by: Miguel A. Vico <mvicomoya@nvidia.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-24 10:27:48 +01:00
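The check described in 63c251e38f might look like the following on Linux. This is a sketch under that assumption, not the actual _eglPointerIsDereferencable() body; `pointer_is_mapped` is a hypothetical name. The residency byte that mincore() fills in is ignored on purpose; only the return value is used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: true if the page containing p is mapped. */
static bool
pointer_is_mapped(const void *p)
{
   long page_size = sysconf(_SC_PAGESIZE);
   unsigned char residency;   /* written by mincore(), deliberately unused */
   uintptr_t addr;

   assert(page_size > 0);

   /* mincore() requires a page-aligned start address. */
   addr = (uintptr_t)p & ~((uintptr_t)page_size - 1);

   /* 0 means the range is mapped; -1 (ENOMEM) means it is not. */
   return mincore((void *)addr, (size_t)page_size, &residency) == 0;
}
```

Whether the page happens to be resident in RAM does not matter here: a swapped-out page is still safe to dereference.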
Miguel A. Vico
045108938c egl: Move _eglPointerIsDereferencable() to eglglobals.[ch]
Move _eglPointerIsDereferencable() to eglglobals.[ch] and make it a
non-static function so it can be used out of egldisplay.c

Signed-off-by: Miguel A. Vico <mvicomoya@nvidia.com>
Reviewed-by: James Jones <jajones@nvidia.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-24 10:27:43 +01:00
Miguel A. Vico
dad0c5d2d7 wayland-egl: Add wl_egl_window ABI checker
Add a small ABI checker for wl_egl_window so that we can check for
backwards incompatible changes at 'make check' time.

Signed-off-by: Miguel A. Vico <mvicomoya@nvidia.com>
Reviewed-by: James Jones <jajones@nvidia.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-24 10:27:10 +01:00
Emil Velikov
4d53b16f55 swr: use the correct variable for no undefined symbols
The variable name was missing a leading LD_, which resulted in a missing
check for unresolved symbols in the backend binaries.

With the link addressed with earlier patches, we can correct the typo.

Thanks to Laurent for the help spotting this.

v2: Split from a larger patch.

Cc: mesa-stable@lists.freedesktop.org
Cc: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Laurent Carlier <lordheavym@gmail.com>
Fixes: 9475251145 "swr: standardize linkage and check for
                             unresolved symbols"
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reported-by: Laurent Carlier <lordheavym@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-24 10:23:45 +01:00
Emil Velikov
9fd23435c2 swr: don't forget to link KNL/SKX against pthreads
Analogous to previous commit but for the KNL/SKX backends.

Cc: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Laurent Carlier <lordheavym@gmail.com>
Fixes: 1cb5a6061c ("configure/swr: add KNL and SKX architecture targets")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-24 10:23:45 +01:00
Emil Velikov
33d397ada5 swr: don't forget to link AVX/AVX2 against pthreads
Seems like the backends have been using pthreads since day one, yet
we've been missing the link.

With a later commit we'll fix a typo, hence the libraries will be built
with -Wl,no-undefined, aka failing the build on unresolved symbols.

v2: Split from a larger patch.

Cc: mesa-stable@lists.freedesktop.org
Cc: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Laurent Carlier <lordheavym@gmail.com>
Fixes: c6e67f5a93 "gallium/swr: add OpenSWR rasterizer"
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-24 10:23:45 +01:00
Emil Velikov
166852ee95 configure.ac: rework wayland-protocols handling
At dist/distcheck time we need to ensure that all the files and their
respective dependencies are handled.

At the moment we'll bail out as the linux-dmabuf rules are guarded in a
conditional. Move them outside of it and drop the sources from
BUILT_SOURCES.

Thus the files will be generated only as needed, which will happen only
after the wayland-protocols dependency is enforced in configure.ac.

v2: add dependency tracking for the header

Cc: Andres Gomez <agomez@igalia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-07-24 10:23:41 +01:00
Dave Airlie
feef47bb59 radv: enable sample shading
This calculates ps_iter_samples from the minSampleShading input

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-24 17:45:03 +10:00
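feef47bb59 does not spell out how ps_iter_samples is derived. One plausible mapping, following the minimum that the Vulkan specification requires for sample shading, max(ceil(minSampleShading * rasterizationSamples), 1), is sketched below. The function name is hypothetical and the driver's real computation may differ, for example by rounding to a hardware-supported value.

```c
#include <assert.h>

/* Hypothetical helper: minimum number of samples to shade individually. */
static unsigned
min_shaded_samples(float min_sample_shading, unsigned rasterization_samples)
{
   float exact;
   unsigned count;

   assert(min_sample_shading >= 0.0f && min_sample_shading <= 1.0f);

   exact = min_sample_shading * (float)rasterization_samples;
   count = (unsigned)exact;        /* truncates toward zero */
   if ((float)count < exact)
      count++;                     /* round up, equivalent to ceil() */

   return count > 0 ? count : 1;   /* never fewer than one sample */
}
```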
Dave Airlie
486472a98d radv: don't set dedicated bit for buffer external memory.
This is an alternate fix for the buffer export dedicated interaction.

Fixes CTS dEQP-VK.api.external.memory.opaque_fd.dedicated.buffer.info

Fixes: b70829708a (radv: Implement VK_KHR_external_memory)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-24 08:30:15 +01:00
Dave Airlie
75392e76ad radv: fix non-0 based layer clears.
If the layer base was > 0, it wasn't getting passed as the start
instance or getting added in the shaders.

Fixes CTS dEQP-VK.api.image_clearing.core.clear_color_attachment.2d_r8_uint_multiple_layers

Fixes: 7e0382fb (radv: add support for layered clears (v2))
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-24 17:27:55 +10:00
Dave Airlie
22b59b99cb radv: check enabled device features.
The spec says we should return VK_ERROR_FEATURE_NOT_PRESENT.

Ported from anv.

Fixes CTS test dEQP-VK.api.device_init.create_device_unsupported_features

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-24 08:16:52 +01:00
Dave Airlie
b7cc07432a radv: for external memory imports close the fd on import success
If we get an fd, we need to close it before returning.

Fixes CTS test dEQP-VK.api.external.memory.opaque_fd.dedicated.device_only.import_multiple_times

Fixes: b70829708a (radv: Implement VK_KHR_external_memory)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-24 04:41:36 +01:00
Bas Nieuwenhuizen
daaf7efb93 radv: Don't segfault when exporting an image which hasn't been bound yet.
The image is set on memory allocation already, but the image doesn't
have to have had BindImageMemory called yet. Luckily, we know the offset
within a BO has to be 0 for dedicated allocations, so we can just
use the dummy 0 in the address calculations.

Fixes CTS test dEQP-VK.api.external.memory.opaque_fd.dedicated.image.export_bind_import_bind

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: b70829708a "radv: Implement VK_KHR_external_memory"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-07-24 01:50:52 +02:00
Bas Nieuwenhuizen
ea08a296fe radv: Handle VK_ATTACHMENT_UNUSED in color attachments.
This just sets them to INVALID COLOR, instead of shifting the
attachments together.

This also fixes a number of cases where we use it first and only
then check if it is VK_ATTACHMENT_UNUSED.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-07-24 01:50:52 +02:00
Andres Gomez
bfe8134472 broadcom: correct header file in BROADCOM_FILES
This fixes `make distcheck`

> make[3]: *** No rule to make target 'common/v3d_devinfo.h', needed by 'distdir'.  Stop.
> make[3]: Leaving directory '/home/local/mesa/src/broadcom'
> Makefile:945: recipe for target 'distdir' failed
> make[2]: Leaving directory '/home/local/mesa/src'
> make[2]: *** [distdir] Error 1
> make[1]: *** [distdir] Error 1

Fixes: 427bbbb99c ("broadcom: Introduce a header for talking about chip revisions.")
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-24 01:40:05 +03:00
Wladimir J. van der Laan
15a1ceb127 etnaviv: Clear lbl_usage array correctly
Fill the entire array instead of just a quarter. This avoids
crashes with large shaders.
(currently this never causes a problem because shaders larger than 2048/4
instructions are not supported by this driver on any hardware, but it will
cause problems in the future)

Fixes: ec43605189 ("etnaviv: fix shader miscompilation with more than 16 labels")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-07-23 21:52:44 +02:00
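"Just a quarter" in 15a1ceb127 is what one would see if memset() were given the element count of an array of 4-byte integers instead of its size in bytes. That is presumably the bug class here, though the commit message does not show the code. The sketch below reproduces the effect with a made-up array size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#define NUM_LABELS 64   /* arbitrary size for the illustration */

/* Returns how many entries ended up as -1 after the memset. */
static size_t
count_cleared(int buggy)
{
   int lbl_usage[NUM_LABELS] = { 0 };
   size_t i, cleared = 0;

   assert(sizeof(lbl_usage) == NUM_LABELS * sizeof(int));

   if (buggy)
      memset(lbl_usage, -1, ARRAY_SIZE(lbl_usage));   /* element count: too few bytes */
   else
      memset(lbl_usage, -1, sizeof(lbl_usage));       /* byte size: whole array */

   for (i = 0; i < ARRAY_SIZE(lbl_usage); i++)
      if (lbl_usage[i] == -1)
         cleared++;

   return cleared;
}
```

With 4-byte ints the buggy variant marks only the first 16 of 64 entries, so any label beyond that range keeps its stale value.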
Jason Ekstrand
6874b953f6 anv/image: zalloc image views
This allows us to avoid some extra zeroing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-22 21:41:12 -07:00
Jason Ekstrand
a1cad8218e anv/image: Use vk_zalloc instead of an explicit memset
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-22 21:41:12 -07:00
Jason Ekstrand
1e32c8303a anv: Separate surface states by layout instead of aux_usage
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-22 21:41:12 -07:00
Jason Ekstrand
628bfaf1c6 intel/isl: Add some sanity checks for compressed surfaces
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-22 21:41:12 -07:00
Jason Ekstrand
5de4209f91 intel/isl: Add a helper to get a subimage surface
We already have a helper for doing this in BLORP, this just moves the
logic into ISL where we can share it with other components.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-22 21:41:12 -07:00
Jason Ekstrand
72bc38cfc5 anv: Get rid of some unused function declarations
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-22 21:41:12 -07:00
Jason Ekstrand
3e57e9494c i965: Enable regular fast-clears (CCS_D) on gen9+
The set of formats which supports CCS_E is actually fairly small on
gen9.  However, everything that supports fast-clears on gen8 also
supports fast-clears on gen9+.  The one very annoying exception is
that blending is broken for non-0/1 clear colors with sRGB formats.
In order to solve that problem, we do a resolve to get rid of the
clear color.  Another option would be to just not fast-clear with
non-0/1 clear colors; however, non-0/1 + blending + sRGB is uncommon
enough that this shouldn't be a significant performance problem.

This appears to help gl_manhattan31_off by about 2%.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
d4de403f91 intel/isl: Add a helper for determining if a color is 0/1
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
b26b2490e5 intel/blorp: Allow blorp_copy on sRGB formats
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
6c2842f95b i965: Weaken the texture view rules for formats slightly
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
fb86ac94cb intel/isl/format: Add an srgb_to_linear helper
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
44e9d65757 intel/isl/format: Dedent the template in gen_format_layout.py
This makes it much easier to edit the template and doesn't really dirty
the python all that much.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
dd75edb429 i965/surface_state: Get the aux usage from the miptree code
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
0175077af5 i965/surface_state: Take an isl_aux_usage in emit_surface_state
This commit replaces the generic "flags" parameter with a more explicit
aux usage parameter.  This leads to a lot of duplicated code at the
moment but this will all get cleaned up directly.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
fe38d3e3a4 i965/miptree: Take an isl_format in prepare_texture
This will be a bit more convenient momentarily.  It's also more correct
because it makes prepare_texture take sRGB into account.
2017-07-22 20:59:22 -07:00
Jason Ekstrand
2ccfc0ffdd i965/miptree: Use miptree range helpers in has_color_unresolved
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
1c70c57aed i965/miptree: Allow for accessing a CCS_E image as CCS_D
This requires us to start using the partial clear state.  It makes
things quite a bit more complicated but it's still a fairly
straightforward exercise in diagram following.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
06ef36d319 i965/miptree: Use ISL_AUX_STATE_PARTIAL_CLEAR for CCS_D
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
268ba028dc intel/isl: Add an aux state for "partial clear"
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
d6ee832cbc i965/miptree: Take an aux_usage in prepare/finish
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
e1ce252106 i965/miptree: Refactor some things to use mt->aux_usage
Now that we have this field, it's much easier to switch on it than to
walk an if ladder that checks different things.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
427d248fbd i965/blorp: Use prepare/finish_depth for depth clears
We also simplify the way we handle stencil since we know a priori that
it will have ISL_AUX_USAGE_NONE.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
e8fca3ffde i965/blorp: Use render_aux_usage for color clears
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
4185b982af i965/blorp: Be more accurate about aux usage in blorp_copy
The only real change here is that we now reject clear colors for MCS
with certain formats on gen < 9 because we can't trust that the
reinterpretation will work.  This may cause some MCS partial resolves.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
fc1639e46d i965/blorp: Use texture/render_aux_usage for blits
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
0f9b609cf4 i965/blorp: Do prepare/finish manually
Our attempts to do it automatically are problematic at best.  In order
to really be precise, we need to know both the desired aux usage and
whether or not clear is supported.  The current automatic mechanism
doesn't cover this.  This commit itself is not a functional change since
it just reworks everything to be in terms of a silly helper.  Later
commits will switch things over to more sensible ways of choosing usage.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
baa9e05965 i965/miptree: Rework prepare/finish_render to be in terms of aux_usage
We keep the old and possibly broken method of determining aux usage
intact for now.  Therefore, the only functional change here is that we
may call finish_render a bit more accurately.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
c9314d2c46 i965/miptree: Add a helper for getting the aux usage for texturing
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
d3c01c6a9a i965/miptree: Partially resolve MCS for texture views
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
c8aa5191eb i965/miptree: Add support for partially resolving MCS
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
987c09e044 i965/miptree: Tighten up finish_mcs_write
Multisample surfaces only have a single miplevel so there's no reason to
be passing the extra parameters around.  It only leads to confusion.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
18a69bbc0f i965/miptree: Make aux_state work in terms of logical layers
This commit changes layer_range_length to return logical layers and also
changes the way we allocate the aux_state field to not allocate extra
layers for MCS.  This will be important as we're about to start doing
significantly more detailed tracking of MCS state.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
c9cb37b2a6 intel/blorp: Add a partial resolve pass for MCS
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
efc4c6b702 i965/miptree: Remove some unneeded restrictions
intel_miptree_supports_ccs_e should handle the gen >= 9 requirement and
there's no reason why we can't do CCS_E on window system buffers so long
as we resolve.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
9c09672ad4 i965/miptree: Stop setting FOR_SCANOUT for renderbuffers
Nothing created through intel_miptree_create_for_renderbuffer will ever
be exposed externally so there's no need to set FOR_SCANOUT.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
9ef517276d i965/blorp: Do flushes around depth resolves
It turns out that if you have rendering in-flight with CCS_E enabled and
you go to do a depth resolve without flushing, the CCS data may never
hit the memory.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Jason Ekstrand
36aed7c74c i965/blorp: Use the renderbuffer format for clears
This fixes the Piglit ARB_texture_views rendering-formats test.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 20:59:22 -07:00
Nanley Chery
67027ddf3f anv: Predicate fast-clear resolves
Image layouts only let us know that an image *may* be fast-cleared. For
this reason we can end up with redundant resolves. Testing has shown
that such resolves can measurably hurt performance and that predicating
them can avoid the penalty.

v2:
- Introduce additional resolve state management function (Jason Ekstrand).
- Enable easy retrieval of fast clear state fields.
v3: Use more descriptive field enums (Jason)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery
8e2729fbb8 intel/blorp: Allow BLORP calls to be predicated
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery
be516ba9b1 anv/cmd_buffer: Skip some input attachment transitions
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery
597ff919e7 anv: Stop resolving CCS implicitly
With an earlier patch from this series, resolves are additionally
performed on layout transitions. Remove the now unnecessary implicit
resolves within render passes.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery
5ba93e6f5a anv: Transition more color buffer layouts
v2: Expound on comment for the pipe controls (Jason Ekstrand).
v3:
- Cast base_layer to uint64_t to avoid overflow.
- Remove "seems" from the pipe control comment.
- Fix clamp of layer_count (Jason Ekstrand).

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery
a899747eb3 anv/cmd_buffer: Warn about not enabling CCS_E
Use the performance warning infrastructure to provide helpful
information when testing applications.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery
9c9f63d1c7 anv/cmd_buffer: Move aux_usage assignment up
For readability, bring the assignment of CCS closer to the assignment of
NONE and MCS.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery
62d72bb5d0 anv/cmd_buffer: Always enable CCS_D in render passes
The lifespan of the fast-clear data will surpass the render pass scope.
We need CCS_D to be enabled in order to invalidate blocks previously
marked as cleared and to sample cleared data correctly.

v2: Avoid refactoring.
v3: Allow CCS_D for subpass resolves.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery
8e532aa028 anv/cmd_buffer: Disable CCS on gen7 color attachments upfront
The next patch enables the use of CCS_D even when the color attachment
will not be fast-cleared. Catch the gen7 case early to simplify the
changes required.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery
9fd1f2aa3c anv/cmd_buffer: Ensure fast-clear values are current
v2: Rewrite functions, change location of synchronization.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:10 -07:00
Nanley Chery
0b16600056 anv/gpu_memcpy: Add a lighter-weight GPU memcpy function
We'll be performing a GPU memcpy in more places to copy small amounts of
data. Add an alternate function that thrashes less state.

v2:
- Make a new function (Jason Ekstrand).
- Move the #define into the function.
v3:
- Update the function name (Jason).
- Update comments.
v4: Use an indirect drawing register as TEMP_REG (Jason Ekstrand).

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery
dcff5ab9f1 anv/cmd_buffer: Restrict fast clears in the GENERAL layout
v2: Remove ::first_subpass_layout assertion (Jason Ekstrand).
v3: Allow some fast clears in the GENERAL layout.
v4: Remove extra '||' and adjust line break (Jason Ekstrand).

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery
9ffe87122b anv/cmd_buffer: Don't partially fast clear image layers
v2: Don't pass in the command buffer (Jason Ekstrand).
v3: Remove an incorrect assertion and an if condition for gen7.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery
07cc2ec9db anv/cmd_buffer: Initialize the clear values buffer
v2: Rewrite functions.
v3 (Jason Ekstrand):
- Don't set ResourceMinLOD.
- Fix clamp of level_count.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery
88200e87f6 anv/image: Append CCS/MCS with a fast-clear state buffer
v2: Update comments, function signatures, and add assertions.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery
325ecffc62 anv/image: Disable CCS if the image doesn't support rendering
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery
01db9a74c6 intel/isl: Add surface state clear value information
This will be used to load and store clear values from surface state
objects.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-22 20:12:09 -07:00
Nanley Chery
b178e239dd anv: Transition MCS buffers from the undefined layout
v2: Define MCS buffers with any sample count (Jason)

Cc: <mesa-stable@lists.freedesktop.org>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2017-07-22 20:12:09 -07:00
Jason Ekstrand
f793c57cc5 intel/isl: Tighten up restrictions for CCS on gen7
It may technically be possible to enable some sort of fast-clear support
for at least the base slice of a 2D array texture on gen7.  However,
it's not documented to work, we've never tried to do it in GL, and we
have no idea what the hardware does if you turn on CCS_D with arrayed
rendering.  Let's just play it safe and disallow it for now.  If someone
really cares that much about gen7 performance, they can come along and
try to get it working later.
2017-07-22 20:12:07 -07:00
Chris Wilson
4aee05b6c6 i965/bufmgr: Add comments about GTT coherency issues.
(Patch written by Ken, but the comments were written entirely by Chris.)

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-22 19:34:48 -07:00
Kenneth Graunke
0044de931f i965: Drop non-LLC lunacy in the program cache code.
The non-LLC story was a horror show.  We uploaded data via pwrite
(drm_intel_bo_subdata), which would stall if the cache BO was in
use (being read) by the GPU.  Obviously, we wanted to avoid that.
So, we tried to detect whether the buffer was busy, and if so, we'd
allocate a new BO, map the old one read-only (hopefully not stalling),
copy all shaders compiled since the dawn of time to the new buffer,
upload our new one, toss the old BO, and let the state upload code
know that our program cache BO changed.  This was a lot of extra data
copying, and flagging BRW_NEW_PROGRAM_CACHE would also cause a new
STATE_BASE_ADDRESS to be emitted, stalling the entire pipeline.

Not only that, but our rudimentary busy tracking consisted of a flag
set at execbuf time, and not cleared until we threw out the program
cache BO.  So, the first shader upload after any drawing would hit this
"abandon the cache and start over" copying path.

This is largely unnecessary - it's just ancient and crufty code.  We can
use the same persistent mapping paths on all platforms.  On non-ancient
kernels, this will use a write combining map, which should be reasonably
fast.

One aspect that is worse: we do occasionally grow the program cache BO,
and copy the old contents to the newer BO.  This will suffer from UC
readback performance now.  To mitigate this, we use the MOVNTDQA based
streaming memcpy on platforms with SSE 4.1 (all Gen7+ atoms).  Gen4-5
are unfortunately going to be penalized.

v2: Add MOVNTDQA path, rebase on other map flag changes.
v3: Drop cache->bo_used_by_gpu too (caught by Chris Wilson).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-22 19:34:42 -07:00
Kenneth Graunke
8bdbc0c5b9 i965: Set MAP_PERSISTENT on program cache buffers.
Chris Wilson pointed out that this mapping really is persistent.

Shouldn't actually have any effect today, but best to set it anyway.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-22 19:34:42 -07:00
Kenneth Graunke
2e3d825982 i965: Correctly set MAP_WRITE when creating the LLC program cache map.
Using a read-only mapping is completely bogus - we use this mapping to
write all new shaders to the cache.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-22 19:34:42 -07:00
Matt Turner
f37ede40ba i965/bufmgr: Use write-combine mappings where available
Write-combine mappings give much better performance on writes than
uncached access through the GTT.

Improves performance of GFXBench 4's gl_driver2 benchmark at 1024x768
on Apollolake by 3.6086% +/- 0.674193% (n=15).

v2: (by Ken) Rebase on lockless mappings, map_count deletion, valgrind
    updates, potential for CPU/WC maps failing, and other changes.

v3: (by Ken and Chris Wilson)

    (Ken): Rebase on set_domain -> gem_wait
    (Chris): Fix up a failed CPU/WC mmapping with a GTT mapping

    Not all objects will be mappable for direct access by the CPU
    (either using WC/CPU or WC paths), for example, a dmabuf wrapping an
    object on a foreign device or an object wrapping access to stolen
    memory. Since either the physical pages are not known or even do not
    exist, we need to use the mediated, indirect access via the GTT. (If
    one day, the kernel does suddenly start providing mediated access
    via a regular WB/WC mmapping, we no longer need the fallback.)

v4: Avoid falling back for MAP_RAW (Chris).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-22 19:34:42 -07:00
Kenneth Graunke
bdae2ddff8 i965/bufmgr: Skip wait ioctl when not busy.
If the buffer is idle, I915_GEM_WAIT will return immediately,
so we may as well skip the ioctl altogether.  We can't trust the
"idle" flag for external buffers, but for most, it should be fine.
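
The check described above can be sketched as follows. This is an
illustration only, not the driver's code: the Bo class, gem_wait() and
bo_wait_rendering() are hypothetical names, and gem_wait() merely
records that the (simulated) ioctl was issued.

```python
class Bo:
    """Hypothetical buffer object with the two flags the message mentions."""

    def __init__(self, idle, external=False):
        self.idle = idle          # userspace's belief that the GPU is done
        self.external = external  # shared with another process or device


wait_calls = []  # records every simulated wait ioctl


def gem_wait(bo):
    """Stand-in for the wait ioctl; after it returns the buffer is idle."""
    wait_calls.append(bo)
    bo.idle = True


def bo_wait_rendering(bo):
    """Skip the ioctl when the buffer is known idle and not external."""
    if bo.idle and not bo.external:
        return
    gem_wait(bo)
```

The idle flag is ignored for external buffers because another user may
have made them busy without this process noticing.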

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-22 19:34:42 -07:00
Kenneth Graunke
38e2142f39 i965/bufmgr: Explicitly wait instead of using I915_GEM_SET_DOMAIN.
With the advent of asynchronous maps, domain tracking doesn't make a
whole lot of sense.  Buffers can be in use on both the CPU and GPU at
the same time.  In order to avoid blocking, we stopped using set_domain
for asynchronous mappings, which means that the kernel's tracking has
lies.  We can't properly track it in userspace either, as the kernel
can change domains on us spontaneously (for example, when un-swapping).

According to Chris Wilson, I915_GEM_SET_DOMAIN does the following:

1. pins the backing storage (acquiring pages outside of the
   struct_mutex)

2. waits either for read/write access, including inter-device waits

3. updates the domain, clflushing as required

4. marks the object as used (for swapping)

5. turns off FBC/PSR/fancy scanout caching

Item (1) is not terribly important.  Most BOs are recycled via the
BO cache, so they already have pages.  Regardless, we fixed this
via an initial set_domain in the previous patch.

We implement item (2) with I915_GEM_WAIT.  This has one downside:
we'll stall unnecessarily if we do a read-only mapping of a buffer
that the GPU is reading.  I believe this is pretty uncommon.  We
may want to extend the wait ioctl at some point.

Mesa already does item (3) itself.  For cache-coherent buffers (most on
LLC systems), we don't need to do any clflushing - the CPU and GPU views
are coherent.  For non-coherent buffers (most on non-LLC systems), we
currently only use the CPU for read-only maps, and we explicitly clflush
when necessary.

We don't care about item (4)...swapping has already killed performance.
Plus, with async maps, the kernel's domain tracking is already bogus,
so it can't do this accurately regardless.

Item (5) should be okay because we avoid cached maps of scanout buffers.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-22 19:34:42 -07:00
Kenneth Graunke
eb1497e968 i965/bufmgr: Allocate BO pages outside of the kernel's locking.
Suggested by Chris Wilson.

v2: Set the write domain to 0 (suggested by Chris).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-22 19:34:42 -07:00
Timothy Arceri
d91108b1f4 glsl: rework misleading block layout code
From the ARB_uniform_buffer_object spec:

   ""shared" uniform blocks, the default layout, ..."

This doesn't fix anything as the default layout is already applied
at this point but fixes the misleading code/comment.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-07-23 10:06:01 +10:00
Timothy Arceri
316b4c9ada glsl: remove placeholder comment
This was added in 2d03f48a65 and seems like it was intended
as a TODO comment in a function stub rather than a useful
code comment.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-07-23 10:06:01 +10:00
Brian Paul
b4debc0d69 st/mesa: use proper resource target type in st_AllocTextureStorage()
When we validate the texture sample count, pass the correct
pipe_texture_target for the texture, rather than PIPE_TEXTURE_2D.

Also add more comments about MSAA.

No piglit regressions with VMware driver.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-07-22 13:18:56 -06:00
Brian Paul
aeade86db5 mesa: remove pointless assignments in init_teximage_fields_ms()
The NumSamples and FixedSampleLocation fields are set again later at
the end of the function so these earlier assignments aren't needed.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-07-22 13:18:56 -06:00
Neha Bhende
1820ef64c9 svga: Limit number of immediates in shader
imm {128.0, -128.0, 2.0, 3.0} is used for lit instruction which
is not used very frequently. So allocate it only if lit instruction is used.

Tested with mtt piglit and mtt glretrace

v2: As per Charmaine's comment

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-22 13:18:56 -06:00
Charmaine Lee
83ca6b9d31 svga: fix constant indices for texcoord scale factors and texture buffer size
This patch fixes the ordering of the constant indices for texcoord scale
factor and texture buffer size to match the order they were added to the
constant buffer in svga_get_extra_constants_common().

Tested with MTT piglit, glretrace.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-07-22 13:18:56 -06:00
Neha Bhende
acfb1583a5 svga: fix unnormalized->normalized texture coordinate conversion
Sometimes, converting unnormalized coordinates to normalized
coordinates requires an epsilon value to produce the right texels with
nearest filtering.  Adding 0.0001 to the coordinates when the min/mag
filter is nearest fixes the issue.
Fixes piglit test fbo-blit-scaled-linear
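
A minimal sketch of where the epsilon is applied, assuming the texel is
selected as floor(normalized * size). The function names are
hypothetical, and Python doubles stand in for the GPU's 32-bit floats,
so this shows the shape of the fix rather than reproducing the original
rounding failure.

```python
import math

EPSILON = 0.0001  # value quoted in the commit message


def normalize(coord, size, nearest_filter):
    """Convert an unnormalized coordinate to the [0, 1) range."""
    if nearest_filter:
        coord += EPSILON  # nudge the coordinate off the texel boundary
    return coord / size


def nearest_texel(normalized, size):
    """Texel index picked by nearest filtering (assumed floor rule)."""
    return math.floor(normalized * size)


size = 100
texels = [nearest_texel(normalize(float(u), size, True), size)
          for u in range(size)]
```

With the nudge applied, every integer coordinate maps back to its own
texel even if the divide and multiply round slightly downwards.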

Tested with mtt-piglit, mtt-glretrace

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-22 13:18:56 -06:00
Brian Paul
dc62ddfb39 svga: only support 4x, 8x, 16x msaa
Skip 2x MSAA, for example, since it's seldom used and just bloats
the list of pixel formats.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-22 13:18:56 -06:00
Brian Paul
922dc27273 mesa: include texture size in error messages
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-07-22 13:18:56 -06:00
Kenneth Graunke
665fd10396 i965: Support the mesa_no_error driconf option.
This allows us to override contexts to use no_error functionality
even if the applications themselves do not.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-22 11:42:42 -07:00
Jason Ekstrand
20533e0da7 anv/blorp: Assert isl_surf_init success in do_buffer_copy
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 08:21:27 -07:00
Jason Ekstrand
cf39fb06e3 anv/blorp: Explicitly set row_pitch in do_buffer_copy
We have a very specific row pitch that we want and we don't want ISL to
be changing it on us so just be explicit about it.

Fixes: a40f043034
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 08:20:07 -07:00
Kenneth Graunke
fd199fe4a8 i965: Delete gen8_draw_upload.c
For some reason we left an empty file, rather than deleting it.
2017-07-22 00:42:51 -07:00
Karol Herbst
f98a221f2d nv50/ir: disable mul+add to mad for precise instructions
fixes
    misrendering in TombRaider
    KHR-GL44.gpu_shader5.precise_qualifier
    KHR-GL45.gpu_shader5.precise_qualifier

v4: disable opt only for MAD, it's fine for SAD
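
Why fusing matters for "precise" can be shown with a small sketch. It
assumes a fused MAD that rounds only once; whether a given GPU's MAD
behaves this way is not stated here. The helper names are hypothetical,
and Python doubles are used instead of 32-bit shader floats.

```python
from fractions import Fraction


def mul_then_add(a, b, c):
    """Two roundings: the product is rounded, then the sum is rounded."""
    return (a * b) + c


def fused_mad(a, b, c):
    """One rounding: a*b+c is computed exactly, then rounded once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

separate = mul_then_add(a, b, c)  # a*b rounds to 1.0, so the sum is 0.0
fused = fused_mad(a, b, c)        # the exact result -2**-60 survives
```

The two results differ, so replacing mul+add with mad can change what a
shader computes, which is what the precise qualifier forbids.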

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
2017-07-21 23:45:18 -04:00
Karol Herbst
f9bfc93014 nv50/ir/tgsi: handle precise for most ALU instructions
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
2017-07-21 23:45:18 -04:00
Karol Herbst
1d7c232fbd nv50/ir: add precise field to Instruction
v4: initialize field with NULL

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
2017-07-21 23:45:18 -04:00
Karol Herbst
4ad9e2e17a st/glsl_to_tgsi: don't optimize mul+add to mad if expression is precise
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-21 23:45:18 -04:00
Karol Herbst
c5cbb9a543 gallium/docs: add precise instruction modifier
v4: add comment about intermediate rounding step to MAD

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-07-21 23:45:18 -04:00
Karol Herbst
4611343bcc tgsi/text: parse _PRECISE modifier
v2: use str_match_no_case to fix _SAT_PRECISE detection
v4: use is_digit_alpha_underscore to match end of mods

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-21 23:45:18 -04:00
Karol Herbst
d0dfdf704d tgsi: populate precise
Only implemented for glsl->tgsi. Other converters just set precise to 0.

v2: remove precise parameter from ureg_tex_insn and ureg_memory_insn

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-21 23:45:18 -04:00
Karol Herbst
28a5e7104e st/glsl_to_tgsi: handle precise modifier
All subexpressions inside an ir_assignment need to be tagged as precise.

v2: make precise handling more global inside the visitor

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-21 23:45:18 -04:00
Karol Herbst
0341aea2f8 tgsi/dump: print _PRECISE modifier on Instructions
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-21 23:45:18 -04:00
Karol Herbst
af22adee4f tgsi: add precise flag to tgsi_instruction
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-07-21 23:45:18 -04:00
Kenneth Graunke
30d6bc470a i965: Set lower_vote_trivial in vector_nir_options_gen6 too.
There's a second struct for Gen6+.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-21 18:09:01 -07:00
Dave Airlie
22bca8ef19 radv: reset non-syncobj semaphore context after wait.
When I ported from libdrm, I forgot to add the line to reset
the sem; we just need to reset the context.

This fixes a regression in DOOM.

Fixes: 9ac1432a57 ("radv: port to new libdrm API.")
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-22 00:03:26 +01:00
Charmaine Lee
5124bf9823 st/mesa: add destroy_drawable interface
With this patch, the st manager will maintain a hash table for
the active framebuffer interface objects. A destroy_drawable interface
is added to allow the state tracker to notify the st manager to remove
the associated framebuffer interface object from the hash table,
so the associated framebuffer and its resources can be deleted
at framebuffers purge time.
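
The bookkeeping described above might look roughly like this. Only the
destroy_drawable name comes from the commit; the StManager class, its
other methods, and purge_framebuffers() are hypothetical and heavily
simplified.

```python
class StManager:
    """Hypothetical manager tracking live framebuffer interface objects."""

    def __init__(self):
        self._framebuffers = {}  # drawable id -> framebuffer interface

    def add_framebuffer(self, drawable_id, iface):
        self._framebuffers[drawable_id] = iface

    def destroy_drawable(self, drawable_id):
        """Called when a drawable goes away; drop it from the table."""
        self._framebuffers.pop(drawable_id, None)

    def is_live(self, drawable_id):
        return drawable_id in self._framebuffers


def purge_framebuffers(context_fbs, manager):
    """Keep only framebuffers whose drawable is still in the table."""
    return [fb for fb in context_fbs if manager.is_live(fb)]
```

A context can then call the purge step periodically and release the
resources of every framebuffer whose drawable has been destroyed.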

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101829
Fixes: 147d7fb772 ("st/mesa: add a winsys buffers list in st_context")
Tested-by: Brad King <brad.king@kitware.com>
Tested-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-07-20 17:34:34 -07:00
Dylan Baker
59a141c95a radv: rebase radv_entrypoints_gen.py on anv_entrypoints_gen.py
The two generators forked from each other, and they remain basically the
same. This rebases the radv version on the anv version, but with the
radv changes ported over. The result is that we get rid of the "cat |"
madness and gain mako, correct "generated by" attributions, and write
files out directly.

The only differences in the output are whitespace and comments.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-07-21 14:27:02 -07:00
Topi Pohjolainen
bf24c3539e i965/miptree: Clean-up unused
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen
f5859b45b1 i965/miptree: Switch remaining surfaces to isl
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen
38ddb3bc60 i965/miptree: Drop miptree_array_layout in get_isl_dim_layout()
This was only needed for checking gen6 stencil which is already
using isl. One could delete GEN6_HIZ_STENCIL layout altogether
but that will be gone with the rest after a while anyway.

The dim_layout converter is needed even after transition to isl
when setting up surface states - see brw_emit_surface_state().
Hence dropping the unneeded argument separately.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen
61c95c94a0 i965/miptree: Relax size alignment for linear surfaces
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen
bbd89c1951 i965/miptree: Store compression flag also for isl based
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen
f8894fab02 i965/miptree: Check tex image allocation failures
allowing graceful failure instead of crash on assert later on.

This can be hit, for example, on SNB when trying to allocate
8kx8k CUBE_MAP against isl: x-tiled buffer size becomes
2421161984 exceeding the maximum of 1 << 31 == 2147483648.

Another way to hit this on SNB is with multisampling of over
64-bit formats.
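
The size check for the SNB cube map case quoted above can be worked
through as follows. The two numbers are taken from the message; the
function name and the "return None on failure" convention are
assumptions made for illustration.

```python
MAX_BO_SIZE = 1 << 31  # limit quoted in the commit message


def try_allocate(size_in_bytes, limit=MAX_BO_SIZE):
    """Return the size on success, or None so the caller can fail cleanly."""
    if size_in_bytes > limit:
        return None
    return size_in_bytes


x_tiled_cube_size = 2421161984  # 8kx8k CUBE_MAP, x-tiled, from the message
result = try_allocate(x_tiled_cube_size)
overshoot = x_tiled_cube_size - MAX_BO_SIZE
```

Here result is None and the request exceeds the limit by 273678336
bytes, so the caller can report the failure instead of hitting an
assert later on.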

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen
4aea4d6d64 main/teximage: Even on failure use valid format for init()
Otherwise init_teximage_fields_ms() (called by
_mesa_init_teximage_fields()) will always assert as it can't
find valid base format.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen
fbfc6a2f67 intel/isl/gen7: Don't allow multisampled surfaces with valign2
There is the same constraint later on as an assert in
isl_gen7_choose_image_alignment_el() so catch it earlier in order
to return error instead of crash.

Needed to avoid crashes with piglits on IVB and HSW:

arb_internalformat_query2.image_format_compatibility_type pname checks
arb_internalformat_query2.all internalformat_<x>_type pname checks
arb_internalformat_query2.max dimensions related pname checks
arb_copy_image.arb_copy_image-formats --samples=2/4/6/8
arb_texture_float.multisample-fast-clear gl_arb_texture_float

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen
df9bb8dc05 intel/isl/gen7: Allow msaa with signed integer formats
These formats are already allowed by the i965 GL driver, and the
feature seems to work just fine.

There are tests for multisampled rendering in piglit:
tests/spec/ext_framebuffer_multisample which can be patched to
try 16I/32I in addition to GL_RGBA8I.
IvyBridge passed all tests with all sample numbers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen
abb84e3f2d intel/isl/gen7: Allow msaa with 128-bit formats
These formats are already allowed by the i965 GL driver, and the
feature seems to work just fine.

There are tests for multisampled rendering in piglit:
tests/spec/ext_framebuffer_multisample which can be patched to
try GL_RGBA16F/32F/16I/16UI/32I/32UI in addition to GL_RGBA/8I.
IvyBridge passed all tests with all sample numbers and even
with 128-bit formats.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen
514d68576d intel/isl: Allow 1D surfaces with compressed formats
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen
a40f043034 intel/isl: Align non-tiled horizontally by cache line
in order to support blit engine.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen
75f95c710f i965/miptree/gen4: Prepare x-tiled fallback for isl based
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen
4ea63fab77 i965/miptree: Prepare non-tiled fallback for isl based
See brw_miptree_choose_tiling().

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Topi Pohjolainen
d84f929d85 i965/miptree: Prepare has_color_unresolved() for isl based
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-22 00:14:16 +03:00
Roland Scheidegger
dbde58dd31 gallivm: handle call attributes for llvm < 4.0 in lp_add_function_attr
We had some caller using LLVMAddInstrAttributes, which couldn't be
converted to lp_add_function_attr, because attributes were only handled
for functions in this case, so fix this.
For llvm >= 4.0, this already works correctly.
(radeonsi seems to avoid setting call site attributes prior to llvm 4.0,
the patch then citing it doesn't work when calling intrinsics. But at
least for calling external functions we always used that, albeit only
for actual call attributes, not call parameter attributes, though some
quick test shows llvm seems to handle that as well. The attribute index
is sort of iffy though, since attribute 0 of the call is the actual function,
attribute 1 corresponds to the first parameter of the called function.)
(Verified with GALLIVM_DEBUG=dumpbc plus llvm-dis that the correct
attributes are shown for calls, both for llvm 4.0 and 3.3.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-07-21 22:46:04 +02:00
Alex Smith
af9d6a8a99 radv: Generate storage image descriptors unconditionally
We can also use storage images internally for resolves, which don't
require TRANSFER_DST usage on the image, so currently we may not create
the needed descriptors.

Just create these descriptors unconditionally.

Fixes: 0e1886efb9 ("radv: Fix descriptors for cube images with VK_IMAGE_USAGE_STORAGE_BIT")
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-22 06:40:29 +10:00
Tim Rowley
d1e7153228 swr/rast: quit using linux-specific gettid()
Linux-specific gettid() syscall shouldn't be used in portable code.
Fix does assume a 1:1 thread:LWP architecture, but works for our
current target platforms and can be revisited later if needed.

Fixes unresolved symbol in linux scons builds.

v2: add comment in code about the 1:1 assumption.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-21 15:37:56 -05:00
Dave Airlie
eaa56eab6d radv: initial support for shared semaphores (v2)
This adds support for sharing semaphores using kernel syncobjects.

Syncobj backed semaphores are used for any semaphore which is
created with external flags, and when a semaphore is imported,
otherwise we use the current non-kernel semaphores.

Temporary imports from syncobj fd are also available, these
just override the current user until the next wait, when the
temp syncobj is dropped.

v2: allocate more chunks upfront, fix off by one after
previous refactor of syncobj setup, remove unnecessary null
check.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-21 21:31:54 +01:00
Dave Airlie
b5670beb31 radv/winsys: add syncobj hooks
This just adds syncobj create/destroy/export/import paths into
the winsys interface.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-21 21:31:54 +01:00
Dave Airlie
80562f2b77 ac/gpu: add code to detect if kernel supports sync objects.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-21 21:31:54 +01:00
Tim Rowley
3e03ecaaf6 swr/rast: fix memory paths for avx512 optimized avx/sse
Source/destination will not be AVX512 aligned, use the
unaligned load/store intrinsics.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-21 15:13:14 -05:00
Tim Rowley
2656a940c2 swr/rast: cache line align hottile buffers
Prevents unalignment crashes with avx512 code on gcc/clang.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-21 15:13:08 -05:00
Tim Rowley
6970f48b6e swr/rast: simdlib changes for clang/gcc
Tested with clang-4.0 and gcc-6.3.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-21 15:12:00 -05:00
Wladimir J. van der Laan
c27cbd88e6 etnaviv: Avoid duplicates in formats table
Remove the following duplicates from the formats table:

- R8G8B8A8_UNORM (V_,_T)
- R8G8B8X8_UNORM (_T,_T)
- DXT3_RGBA (_T,_T)

Only the first has an effect because the _T overrides the V_ initializer,
the latter two were harmless duplications of the same.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-07-21 14:41:07 +02:00
Wladimir J. van der Laan
322b34e57e etnaviv: Add support for ETC2 texture compression
Add support for ETC2 compressed textures in the etnaviv driver.

One step closer towards GL ES 3 support.

For now, treat SRGB and RGB formats the same. It looks like these are
distinguished using a different bit in sampler state, and not part of
the format, but I have not yet been able to confirm this for sure.

(Only enabled on GC3000+ for now, as the GC2000 ETC2 decoder
implementation is buggy and we don't work around that)

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-07-21 12:18:35 +02:00
Wladimir J. van der Laan
c8fe372a15 gallium/util: Implement util_format_is_etc
This is the equivalent of util_format_is_s3tc, but for ETC.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-07-21 12:17:45 +02:00
Chih-Wei Huang
c1a29e104c Android: fix spirv_info.c generation
It's incorrect to use $(LOCAL_PATH) in makefile recipes since it's
changing. The typical way to handle it is to use private variable.
Fortunately in this case we can just simplify them to $^.

See further:
https://patchwork.freedesktop.org/patch/167718/

Also simplify LOCAL_GENERATED_SOURCES.

Fixes: 2dd4e2ec (spirv: Generate spirv_info.c)

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-21 08:48:45 +03:00
Tapani Pälli
b78563f0d0 android: fix libmesa_nir build
current build did not find required include 'spirv_info.h'

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-21 08:47:56 +03:00
Matt Turner
aff108f2fd nir: Optimize find_lsb/imsb/umsb error checks
Two of the ARB_shader_ballot piglit tests hit the find_lsb case,
removing some of the noise allowed me to better debug the test when it
was failing.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-20 16:56:50 -07:00
Matt Turner
069bf7c907 i965/fs: Match destination type to size for ballot
No use in taking a 64-bit value when we know the high 32 bits are zero.
2017-07-20 16:56:50 -07:00
Matt Turner
1038d385a9 nir: Reduce destination size of ballot intrinsic when possible
Some hardware, like i965, doesn't support group sizes greater than 32.
In that case, we can reduce the destination size of the ballot
intrinsic, which will simplify our code generation.
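
A small sketch of why the narrower destination is safe, assuming the
ballot result is a bitmask with one bit per invocation. The helper
names are hypothetical.

```python
def ballot(active_invocations, subgroup_size):
    """Build the ballot bitmask: bit i is set if invocation i voted true."""
    mask = 0
    for i in active_invocations:
        if not 0 <= i < subgroup_size:
            raise ValueError("invocation index outside the subgroup")
        mask |= 1 << i
    return mask


def split_uint64(value):
    """Split a 64-bit ballot into its low and high 32-bit halves."""
    return value & 0xFFFFFFFF, value >> 32


full = ballot(range(32), 32)  # every invocation of a 32-wide subgroup
low, high = split_uint64(full)
```

With at most 32 invocations, no bit at position 32 or above can be set,
so the high half is always zero and a 32-bit destination loses nothing.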

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
51c1659af8 i965: Enable ARB_shader_ballot on Gen8+
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
782ef30451 i965/fs: Implement ARB_shader_ballot operations
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
8238930510 i965/fs: Do not move MOVs writing the flag outside of control flow
The implementation of ballotARB() will start by zeroing the flags
register. So, doing something like

        if (gl_SubGroupInvocationARB % 2u == 0u) {
                ... = ballotARB(true);
                [...]
        } else {
                ... = ballotARB(true);
                [...]
        }

(like fs-ballot-if-else.shader_test does) would generate identical MOVs
to the same destination (the flag register!), and we definitely do not
want to pull that out of the control flow.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Francisco Jerez
f1b7c47913 i965/fs: Handle explicit flag sources in flags_read()
The implementations of the ARB_shader_ballot intrinsics will explicitly
read the flag as a source register.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-20 16:56:49 -07:00
Matt Turner
3e7b8f6cd4 nir: Add pass to scalarize read_invocation/read_first_invocation
i965 will want these to be scalar operations.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
43ef75b394 nir: Add system values from ARB_shader_ballot
We already had a channel_num system value, which I'm renaming to
subgroup_invocation to match the rest of the new system values.

Note that while ballotARB(true) will return zeros in the high 32-bits on
systems where gl_SubGroupSizeARB <= 32, the gl_SubGroup??MaskARB
variables do not consider whether channels are enabled. See issue (1) of
ARB_shader_ballot.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
636fe4d1c6 nir: Add intrinsics from ARB_shader_ballot
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
41437f1b77 i965: Enable ARB_shader_group_vote
2017-07-20 16:56:49 -07:00
Matt Turner
ee9fa4ac18 i965/fs: Implement ARB_shader_group_vote operations
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Francisco Jerez
93dc736f4e i965/fs: Handle explicit flag destinations in flags_written()
The implementations of the ARB_shader_group_vote intrinsics will
explicitly write the flag as the destination register.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-20 16:56:49 -07:00
Matt Turner
30b72f4126 i965/vec4: Lower ARB_shader_group_vote intrinsics
I don't expect anyone is going to care about using this in vec4 programs
(vertex/tessellation/geometry on Gen6/7), and no one has come up with a
good way to implement it, much less test it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
742cc6118a nir: Support lowering vote intrinsics
... trivially (as allowed by the spec!) by reusing the existing
nir_opt_intrinsics code.
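
One possible reading of "trivially", sketched here as an assumption
rather than as the actual NIR pass: treat every invocation as its own
group of one, so "any" and "all" reduce to the input value and "all
equal" is always true. The enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the three vote intrinsics. */
enum vote_op { VOTE_ANY, VOTE_ALL, VOTE_EQ };

/* Result of a vote when the group is assumed to contain only the
 * current invocation.
 */
static bool
lower_vote_trivial(enum vote_op op, bool value)
{
   switch (op) {
   case VOTE_ANY:
   case VOTE_ALL:
      /* A single voter decides the outcome on its own. */
      return value;
   case VOTE_EQ:
      /* A single value is always equal to itself. */
      return true;
   }
   assert(!"unknown vote op");
   return false;
}
```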

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
d4c9d6a3b2 nir: Add pass to optimize intrinsics
Specifically, constant fold intrinsics from ARB_shader_group_vote, but I
suspect it'll be useful for other things in the future.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Matt Turner
ba2fbbf1c0 nir: Add intrinsics from ARB_shader_group_vote
These are intrinsics rather than opcodes, because they operate across
channels.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:56:49 -07:00
Chris Wilson
0e6ad379dd i965: Rename batch->exec_objects to validation_list
Within i965, we have many different objects and confusingly when
submitting an execbuf we have lists of both our internal objects and a
list of the kernel's drm_i915_gem_exec_object with very similar names.
Rename the kernel's validation list to avoid the collision as it is only
used for interfacing with the kernel and so a peripheral use of
"object".

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-20 16:15:32 -07:00
Kenneth Graunke
8696c3e997 Revert "i965: Call intel_prepare_render() from intel_update_state()"
This reverts commit b7153c3e9f.

The point of that commit was to ensure intel_prepare_render() occurred
before color resolves on the current framebuffer.  In 0673bbfd9b
(i965: Move surface resolves back to draw/dispatch time), Jason moved
brw_predraw_resolve_framebuffer back to draw time, which is already
after an intel_prepare_render() call.  So, this is no longer necessary.

Furthermore, it caused problems.  "mpv" would only display a small
corner of movies, and Android started failing camera CTS tests.

This is because intel_prepare_render() ended up handling DRI2 events
which caused the drawable to be resized at an inopportune time, flagging
ctx->NewState |= _NEW_BUFFERS, but at a point where we've already copied
ctx->NewState, and failed to notice the newly set flag.

The lack of _NEW_BUFFERS caused us to skip 3DSTATE_DRAWING_RECTANGLE,
so the drawing ended up being clipped to an outdated framebuffer size.
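
The ordering problem can be sketched roughly as follows. This is a
simplified model with hypothetical names (struct context, handle_resize
and the two update functions are stand-ins, not the i965 code): a
snapshot of the dirty flags taken before the event handler runs cannot
contain a flag the handler sets.

```c
#include <assert.h>
#include <stdint.h>

#define NEW_BUFFERS (1u << 0)   /* stand-in for _NEW_BUFFERS */
#define NEW_COLOR   (1u << 1)   /* some unrelated dirty flag */

struct context {
   uint32_t new_state;          /* stand-in for ctx->NewState */
};

/* Stand-in for event handling that resizes the drawable. */
static void
handle_resize(struct context *ctx)
{
   assert(ctx);
   ctx->new_state |= NEW_BUFFERS;
}

/* Problematic order: the flags are copied first, so the flag set by
 * the handler is missing from the returned snapshot.
 */
static uint32_t
update_state_snapshot_first(struct context *ctx)
{
   uint32_t dirty = ctx->new_state;
   handle_resize(ctx);
   return dirty;
}

/* Safe order: run the handler first, then copy the flags. */
static uint32_t
update_state_handler_first(struct context *ctx)
{
   handle_resize(ctx);
   return ctx->new_state;
}
```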

Just drop the hack and go back to handling this at the proper time.

Thanks to Matti Hämäläinen (ccr), Tomasz Figa (tfiga), and Tapani Pälli
for reporting these issues.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101558
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101704
Tested-by: Tapani Pälli <tapani.palli@intel.com>
2017-07-20 16:10:10 -07:00
Samuel Pitoiset
e87e4f239f mesa: remove useless assert in _mesa_TextureView()
Already checked in _mesa_choose_texture_format().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-20 16:14:36 +02:00
Samuel Pitoiset
1ebe4305fd mesa: remove duplicated code around framebuffer_renderbuffer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-20 16:14:34 +02:00
Samuel Pitoiset
0752428a32 mesa: remove one extra check in _mesa_DeleteTextures()
Already checked above.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-20 16:14:32 +02:00
Samuel Pitoiset
ca7085061d mesa: make _mesa_generate_texture_mipmap() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-20 16:14:30 +02:00
Samuel Pitoiset
a1819704c8 mesa: inline save_array_object()
No need to check if ID is not 0 because _mesa_HashFindFreeKeyBlock()
can't generate this value.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-20 16:14:28 +02:00
Samuel Pitoiset
015c6eba52 mesa: inline remove_array_object()
No need to check if ID is not 0 because _mesa_lookup_vao()
already prevents this from happening.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-20 16:14:26 +02:00
Samuel Pitoiset
ea13aa8530 mesa: tidy up _mesa_DeleteVertexArrays()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-20 16:14:24 +02:00
Samuel Pitoiset
1c6c42c289 mesa: remove useless assert in texture_storage()
Already checked in _mesa_choose_texture_format().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-20 16:14:22 +02:00
Samuel Pitoiset
f95420d74e mesa: pass the 'caller' function to texstorage()
To be consistent with texturestorage().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-20 16:14:20 +02:00
Samuel Pitoiset
9f9441535a mesa: make _mesa_texture_storage() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-20 16:14:18 +02:00
Topi Pohjolainen
67b53ee418 i965: Represent depth surfaces with isl
v2 (Jason):
   - s/separate_stencil_surface/make_separate_stencil_surface/
   - drop the check for separate stencil when wrapping an
     existing buffer object with miptree. This is dead code as
     the first thing needs_separate_stencil() checks is the
     MIPTREE_LAYOUT_FOR_BO flag, for which it says no.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
05232a2361 i965: Drop redundant check for non-tiled depth buffer
Depth buffers are always Y-tiled. In brw_miptree_choose_tiling() the
driver opts to use linear buffers for small and 1D surfaces, but this
does not apply to depth: GL_DEPTH_COMPONENT and GL_DEPTH_STENCIL_EXT
are considered first.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
c4ac0d4949 intel/isl/gen4: Represent cube maps with 3D layout
v2 (Jason): Check for !ISL_SURF_DIM_3D instead of CUBE_BIT.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
f9d3880346 i965/miptree: Prepare 3D surfaces with physical 2D layout
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
ba4d0593f9 i965/miptree: Prepare aux state map for isl based
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
bec048d9e2 i965/miptree: Represent y-tiled stencil copies with isl
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
f69a2ffe44 i965/miptree: Represent w-tiled stencil surfaces with isl
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
c84cb81771 i965/miptree: Prepare compressed offsets for isl based
v2 (Jason): Simply switch to isl_surf_get_image_offset_el()

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
40e75aba73 i965/miptree: Add support for imported bo offsets for isl based
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
0f795effe5 i965/fbo: Add support for isl-based miptrees in rb wrapper
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
066dc9335e i965: Prepare image setup from miptree for isl based
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
76a3ce8fa5 i965: Prepare tex, img and rt state emission for isl based miptrees
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
63a43f4161 i965: Refactor miptree to isl converter and adjustment
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
f1caa6194e i965: Prepare tex (sub)image for isl based
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
7e5c8e593b i965/wm: Prepare image surfaces for isl based
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
59bf765c36 i965/wm: Fix number of layers in 3D images
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
07caa5932c i965/miptree: Prepare intel_miptree_copy() for isl based
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
a844e6a8f4 i965: Prepare blit engine for isl based miptrees
v2: Do not concern cpp, pitch and tiling which are already
    transitioned.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
8e1494f139 i965/miptree: Store chars-per-pixel even for isl based
This will significantly reduce churn when switching remaining
surface types to isl. After the full transition it will be easier
to calculate on-demand and drop the helper member in miptree.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
b95caac539 i965/miptree: Switch to isl_surf::row_pitch
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
37152a5596 i965/miptree: Take interleaving into account in stencil pitch
This makes intel_mipmap_tree::pitch and isl_surf::row_pitch
semantically equivalent.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
43c3b5b523 i965/miptree: Switch to isl_surf::tiling
v2 (Daniel): Use isl tiling converters instead of introducing local.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
171b72542c intel/isl: Add i915 to isl_tiling converter
v2: s/i915_tiling_to_isl_tiling(/isl_tiling_from_i915_tiling/

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
d8521b9960 i965/miptree: Use isl_tiling_to_i915_tiling()
and drop local copy.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
a92e6ff315 i965/miptree: Switch to isl_surf::samples
v2 (Jason):
   - Don't trigger miptree re-creation in vain later on with ISL
     based. Core GL uses zero to indicate single sampled while
     ISL uses one - this would cause intel_miptree_match_image()
     to always fail.
   - Now that native miptree is already using sample number of
     one, there is no need for MAX2() when converting to ISL.
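
A small sketch of the mismatch described above, assuming only what the
notes state: core GL reports 0 for single-sampled while ISL reports 1.
The helper names are hypothetical and this is not the
intel_miptree_match_image() code.

```c
#include <assert.h>
#include <stdbool.h>

/* Map GL's "0 means single-sampled" onto a count where single-sampled
 * is 1.  Counts above 16 are rejected here only to keep the sketch
 * within the range mentioned elsewhere in this series.
 */
static unsigned
samples_from_gl(unsigned gl_samples)
{
   assert(gl_samples <= 16);
   return gl_samples > 1 ? gl_samples : 1;
}

/* Naive comparison: a single-sampled surface (1) never equals GL's 0. */
static bool
samples_match_naive(unsigned surf_samples, unsigned gl_samples)
{
   return surf_samples == gl_samples;
}

/* Comparison after bringing both sides to the same convention. */
static bool
samples_match(unsigned surf_samples, unsigned gl_samples)
{
   return surf_samples == samples_from_gl(gl_samples);
}
```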

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
76e2f390f9 i965/miptree: Use num_samples of 1 instead of 0 for single-sampled
Patch moves "assert(brw->num_samples <= 16)" from
emit_3dstate_multisample2() to upload_multisample_state(). Latter
is the only caller of the former and passes "brw->num_samples"
as argument. Therefore it is clearer to assert in the caller.

Possible bug fix in genX(emit_3dstate_multisample2) which
doesn't have a case for num_samples == 0 in the switch
statement.

It should be noted that intel_miptree_map()/unmap() now checks
additionally for "mt->surf.samples == 1" in order to support gen6
stencil which is already transitioned to ISL. This will go away in
next patch when native miptrees start to use isl_surf::samples as
well.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Topi Pohjolainen
0e8b81af7b i965/miptree: Switch to isl_surf::msaa_layout
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-20 11:32:21 +03:00
Bas Nieuwenhuizen
21d777a122 radv: Add support for VK_KHR_variable_pointers.
Just a trivial enable.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-07-20 09:13:01 +02:00
Bas Nieuwenhuizen
31469c0265 radv: Add VK_KHR_storage_buffer_storage_class support.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-07-20 09:13:01 +02:00
Brian Paul
98240f6399 mesa: check API profile for GL_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION
If we have a compat profile context, it means that GL_QUADS[_STRIP] are
supported so this query makes sense.  It's also legal for 3.2 core profile
because of a spec bug.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-07-19 20:09:09 -06:00
Dave Airlie
9ac1432a57 radv: port to new libdrm API.
This bumps the libdrm requirement for amdgpu to 2.4.82.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-20 01:56:04 +01:00
Dave Airlie
aee382510e radv: introduce some wrapper in cs code to make porting off libdrm_amdgpu easier.
This just introduces a central semaphore info struct, and passes it around,
and introduces some wrappers that will make porting off libdrm_amdgpu easier.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-20 01:55:36 +01:00
Tim Rowley
1cb5a6061c configure/swr: add KNL and SKX architecture targets
Not built by default.  Currently only builds with icc.

v2:
 * document knl,skx possibilities for swr_archs
 * merge with changed loader lib selection code

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-19 15:12:07 -05:00
Tim Rowley
f42186b01d configure/swr: configurable swr architectures
Allow configuration of the SWR architecture-dependent libraries
we build for with --with-swr-archs.  Maintains current behavior
by defaulting to avx,avx2.

SCons changes are made so that it still builds and works, but
without support for configuring which architectures to build.

v2:
 * add missing comma for swr_archs default
 * check that at least one architecture is enabled
 * modify loader logic to make it clearer how to add archs

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-19 15:12:07 -05:00
Tim Rowley
131b9f644c gallium/util: fix nondeterministic avx512 detection
cpuid.7 requires cx=0 to select the extended feature leaf.

avx512 detection was using the non-indexed cpuid resulting
in random non-detection of avx512.
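
For illustration, a minimal sketch of an indexed query, assuming an x86
build with the GCC/Clang <cpuid.h> header. It is not the gallium/util
code; cpu_has_avx512f() is a hypothetical helper. It only reads the
AVX512F feature bit (leaf 7, subleaf 0, EBX bit 16) and does not check
whether the OS has enabled the corresponding register state.

```c
#include <cpuid.h>
#include <stdbool.h>

/* Query CPUID leaf 7 with the subleaf (ECX) explicitly set to 0.
 * __get_cpuid_count() loads the subleaf into ECX before executing
 * CPUID, which is what the non-indexed variant does not guarantee.
 */
static bool
cpu_has_avx512f(void)
{
   unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;

   /* Returns 0 when leaf 7 is not supported by this CPU. */
   if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
      return false;

   return (ebx >> 16) & 1;   /* AVX512F feature bit */
}
```

Because the subleaf is fixed, repeated calls on the same machine return
the same answer.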

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-07-19 15:12:07 -05:00
Marek Olšák
19c101f704 drirc: whitelist War Thunder (Wine) for glthread
Nominated by František Zatloukal <zatloukal.frantisek@gmail.com>
2017-07-19 16:14:47 -04:00
Andres Gomez
9375c1d896 travis: add missing wayland-protocols
> checking for WAYLAND... no
>
> configure: error: Package requirements (wayland-client >= 1.11 wayland-server >= 1.11 wayland-protocols >= 1.8) were not met:
>
> No package 'wayland-protocols' found
>
> Consider adjusting the PKG_CONFIG_PATH environment variable if you
> installed software in a non-standard prefix.
>
> Alternatively, you may set the environment variables WAYLAND_CFLAGS
> and WAYLAND_LIBS to avoid the need to call pkg-config.
> See the pkg-config man page for more details.

Also, added extra path to PKG_CONFIG_PATH env variable.

Fixes: 02cc359372 ("egl/wayland: Use linux-dmabuf interface for buffers")
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-19 22:17:41 +03:00
Chad Versace
5d69052113 anv/image: Fix VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT
We incorrectly detected VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT.  We looked
for the bit in VkImageCreateInfo::usage, but it's actually in
VkImageCreateInfo::flags.
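
A self-contained sketch of the mix-up, using a stand-in struct instead
of the real VkImageCreateInfo so it builds without the Vulkan headers.
The numeric values mirror the Vulkan enums; the struct and function
names are hypothetical. Note that the cube-compatible create flag and
the color-attachment usage bit share the value 0x10, so testing the
wrong field silently answers a different question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined by the Vulkan headers. */
#define IMAGE_CREATE_CUBE_COMPATIBLE_BIT  0x00000010u
#define IMAGE_USAGE_SAMPLED_BIT           0x00000004u
#define IMAGE_USAGE_COLOR_ATTACHMENT_BIT  0x00000010u

/* Stand-in for the two relevant VkImageCreateInfo members. */
struct image_create_info {
   uint32_t flags;
   uint32_t usage;
};

/* Wrong field: this really tests for the color-attachment usage. */
static bool
is_cube_compatible_buggy(const struct image_create_info *info)
{
   assert(info);
   return (info->usage & IMAGE_CREATE_CUBE_COMPATIBLE_BIT) != 0;
}

/* Right field: the create flags carry the cube-compatible bit. */
static bool
is_cube_compatible(const struct image_create_info *info)
{
   assert(info);
   return (info->flags & IMAGE_CREATE_CUBE_COMPATIBLE_BIT) != 0;
}
```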

Found by assertion failures while enabling VK_ANDROID_native_buffer.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-19 11:25:50 -07:00
Andres Gomez
80a0c9745c docs: update master's release notes, news and calendar commit
This reflects closer what we are actually doing.

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-19 19:10:10 +03:00
Andres Gomez
e6f455646a docs: avoid overwrite of LD_LIBRARY_PATH during basic testing
The LD_LIBRARY_PATH environment variable could be already defined so
we extend it and restore it rather than just overwriting it.

v2:
 - Unset the __old_ld helper variable when we are done with it.
 - Corrected test for and escaping of variables (Eric).

v3: Remove unneeded variable (Emil).

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-19 19:10:10 +03:00
Andres Gomez
8c1d87b251 docs: add instructions to specify LLVM version for basic testing
The "Perform basic testing" and "Use the release.sh script from xorg
util-modular" sections provide some instructions to do so. We add now
some comments in order to use a recent enough LLVM version to run
dist/distcheck and the automake generated binaries.

v2: Suggested the need to define LLVM_CONFIG also before running the
    release.sh script.

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-19 19:10:10 +03:00
Eric Engestrom
50d478036a egl: fix line continuation
Trailing space after the backslash meant the rest of the AM_CFLAGS lines
were no longer included.
This has been silently ignored because of the next line starting with
a `-` dash, instructing make to be silent about that line.
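
This kind of mistake is easy to check for mechanically. The following
is a hypothetical lint helper, not part of the Mesa build: it flags a
line whose final backslash is followed by spaces or tabs, since such a
backslash no longer escapes the newline.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true when the line ends in a backslash followed by at least
 * one space or tab, i.e. a continuation that does not continue.
 */
static bool
has_broken_continuation(const char *line)
{
   assert(line);
   size_t len = strlen(line);

   /* Ignore the line terminator itself. */
   while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
      len--;

   /* Skip trailing blanks and remember whether there were any. */
   size_t end = len;
   while (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t'))
      len--;

   return len < end && len > 0 && line[len - 1] == '\\';
}
```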

Fixes: 02cc359372 "egl/wayland: Use linux-dmabuf interface for buffers"
Cc: Daniel Stone <daniels@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-07-19 15:51:54 +01:00
Eric Engestrom
21c2aca6b7 gbm: fix typo
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-07-19 15:51:54 +01:00
Eric Engestrom
8616a9bd35 configure.ac: fix whitespace
Whitespace-only change (`diff -w` is empty).

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-07-19 15:51:54 +01:00
Lucas Stach
c8a0660ab4 etnaviv: advertise supported dmabuf modifiers
Simply advertise all supported modifiers, independent of the format.
Special formats, like compressed, which don't support all those modifiers
are already culled from the dmabuf format list, as we don't support
the render target binding for them.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-07-19 16:26:50 +02:00
Lucas Stach
58c3ce071c etnaviv: implement resource creation with modifier
This allows creating buffers with a specific tiling layout, which is primarily
used by GBM to allocate the EGL back buffers with the correct tiling/modifier
for use with the scanout engines.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-07-19 16:26:50 +02:00
Lucas Stach
d06cfaf4fc etnaviv: fill in modifier in etna_resource_get_handle
This allows the state trackers to know the tiling layout of the
resource and pass this through the various userspace protocols.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-07-19 16:26:50 +02:00
Lucas Stach
eebf6ee6e9 etnaviv: fold etna_screen_bo_get_handle into etna_resource_get_handle
There is no point in keeping this indirection. Makes the code easier to
follow.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com> (v1)
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-07-19 16:26:50 +02:00
Lucas Stach
8a44aa5043 etnaviv: implement resource import with modifier
This implements resource import with modifier, deriving the correct
internal layout from the modifier and constructing a render compatible
base resource if needed.

This removes the special cases for DDX and renderonly scanout allocated
buffers, as the linear modifier is enough to trigger correct handling
of those buffers.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2017-07-19 16:26:49 +02:00
Lucas Stach
605007d5c7 etnaviv: also update textures from external resources
This reworks the logic in etna_update_sampler_source to select the
newest resource view for updating the texture view. This should make
the logic easier to follow and fixes texture updates from imported
dma-bufs.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-07-19 16:26:49 +02:00
Lucas Stach
836d22a2fb etnaviv: increment correct seqno for external resources
If we import a dma-buf with a sampler/pixel pipe incompatible modifier,
the imported buffer will end up in an external resource view. As
resource_changed signals the change of the imported resource, we need
to update the external view seqno, instead of the base resource seqno.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-07-19 16:26:49 +02:00
Lucas Stach
b158ccf1d9 etnaviv: pad scanout buffer size to RS alignment
This fixes failures to import the scanout buffer with screen resolutions
that don't satisfy the RS alignment restrictions, like 1680x1050.
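
The padding itself is plain round-up arithmetic. The sketch below uses
hypothetical helpers and hypothetical alignment values (16 pixels wide,
8 rows high); the real RS requirements depend on the hardware and are
not stated here.

```c
#include <assert.h>

struct extent {
   unsigned width;
   unsigned height;
};

/* Round value up to the next multiple of alignment. */
static unsigned
align_up(unsigned value, unsigned alignment)
{
   assert(alignment != 0);
   return (value + alignment - 1) / alignment * alignment;
}

/* Pad both dimensions so the buffer satisfies the given alignment. */
static struct extent
pad_to_alignment(struct extent size, unsigned align_w, unsigned align_h)
{
   struct extent padded = {
      align_up(size.width, align_w),
      align_up(size.height, align_h),
   };
   return padded;
}
```

With these assumed values, 1680x1050 keeps its width and has its height
padded to 1056.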

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-07-19 16:26:49 +02:00
Lucas Stach
68ec876a25 etnaviv: add helper to work out RS alignment
The minimum RS alignment calculation is needed in various places.
Extract a helper to avoid open-coding the calculation at every site.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-07-19 16:26:49 +02:00
Lucas Stach
c481880899 renderonly/etnaviv: stop importing resource from renderonly
The current way of importing the resource from renderonly after allocation
is opaque and is taking away control from the driver, which it needs in
order to implement more advanced scenarios than the simple linear scanout
with matching stride alignments.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2017-07-19 16:26:49 +02:00
Lucas Stach
a9fad437f7 configure.ac: bump required etnaviv libdrm version to 2.4.82
The following changes need the modifier definitions for the Vivante tiled
formats, which are shipped with libdrm 2.4.82.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-07-19 16:26:49 +02:00
Emil Velikov
b359957469 dri/common: use designated initializers for OptConfElems
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-07-19 13:28:54 +01:00
Tomasz Figa
107b9c70d0 gallium: auxiliary: Fix standalone Android build of u_cpu_detect (v2)
Commit 463b7d0332c5 ("gallium: Enable ARM NEON CPU detection.")
introduced CPU feature detection based on the Android cpufeatures
library. Unfortunately it also added an assumption that if
PIPE_OS_ANDROID is defined, the library is also available, which is not
true for a standalone build that does not use the Android build system.

Fix it by defining HAS_ANDROID_CPUFEATURES in Android.mk and replacing
respective #ifdefs to use it instead.

v2:
 - Add a comment explaining why the separate flag is needed (Emil).

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-19 13:28:23 +01:00
Emil Velikov
644ac2b780 egl: propagate EGL_BAD_ATTRIBUTE during EGLImage attr parsing
An earlier commit refactored/split the parsing into separate hunks.
While no functional change was intended, it did not account for the
fact that a different error is set when the attribute value is incorrect.

Fixes: 3ee2be4113 ("egl: split _eglParseImageAttribList into per
extension functions")
Cc: Michel Dänzer <michel@daenzer.net>
Reported-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-07-19 13:06:50 +01:00
Emil Velikov
a0755f2e6a swr: remove unneeded fallback strcasecmp define
The last user of the function was removed with earlier commit.

Fixes: 50842e8a93 ("swr: replace gallium->swr format enum conversion")
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-07-19 13:06:50 +01:00
Emil Velikov
8e25e23dae st/dri: list __DRI2_FENCE extension only where needed
The extension should be present (if applicable) in the list returned by
getExtensions(). AFAICT no loader has ever looked for it in
__driDriverExtensions/__driDriverGetExtensions.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-07-19 13:06:50 +01:00
Emil Velikov
7791949dad swrast: add dri2ConfigQueryExtension to the correct extension list
The extension should be in the list as returned by getExtensions().
Seems to have gone unnoticed since close to nobody wants to change the
vblank mode for the software driver.

v2: Rebase

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
2017-07-19 13:06:50 +01:00
Emil Velikov
225e45d45c radeon: remove local vblank_mode option
Analogous to previous commits.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-07-19 13:06:50 +01:00
Emil Velikov
6ba3fd2d6d i915: remove local vblank_mode option
Analogous to previous commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-07-19 13:06:50 +01:00
Emil Velikov
b655205ff2 i965: remove local vblank_mode option
The option is only queried from the loader, which has access to the
dri common code in src/mesa/drivers/dri/common/.

One could grant the loader access to brw_config_options but even
then, having the same option in both places is not a good idea.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-07-19 13:06:50 +01:00
Gwan-gyeong Mun
3f6cc931eb egl/dri2: remove unused buffer_count variable
Remove the unused buffer_count variable from dri2_egl_surface and
polish the assert in dri2_drm_get_buffers_with_format().

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-19 13:06:50 +01:00
Gwan-gyeong Mun
faada25f47 egl/drm: Format code in platform_drm.c according to style guide.
This is a tiny housekeeping patch which limits lines to 78 or fewer
characters, according to the mesa coding style guidelines.

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-19 13:06:50 +01:00
Gwan-gyeong Mun
7c89585551 egl/drm: add going out of the loop when the designated buffer is found
Because each of the color_buffers has a unique bo, release_buffer() can
break out of the loop that searches for the buffer as soon as the
designated buffer is found.
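
The early exit can be sketched as follows. This is a minimal illustration,
not the code from platform_drm.c: struct color_buffer_sketch, its fields and
the return value are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the surface's color_buffers. */
struct color_buffer_sketch {
   const void *bo;   /* unique per entry */
   bool locked;
};

/* Unlock the entry that owns 'bo'. Returns its index, or -1 if not found. */
static int
release_buffer_sketch(struct color_buffer_sketch *buffers, size_t count,
                      const void *bo)
{
   int found = -1;

   for (size_t i = 0; i < count; i++) {
      if (buffers[i].bo == bo) {
         buffers[i].locked = false;
         found = (int)i;
         break;   /* each bo is unique, so no later entry can match */
      }
   }
   return found;
}
```

Since no two entries share a bo, stopping at the first match changes no
result; it only skips the remaining comparisons.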

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-19 13:06:50 +01:00
Gwan-gyeong Mun
89505f7ead gbm: fix typo in doxygen comment
This fixes the misspelling of gbm_bo_import api param.

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-19 13:06:50 +01:00
Daniel Stone
46dace14ff egl: Add MKDIR_GEN definition
Adding linux-dmabuf Wayland protocol files as generated did the right
thing, by prepending $(MKDIR_GEN) so autotools didn't try to write into
a build directory which didn't yet exist.

Unfortunately MKDIR_GEN needs to be defined in every Makefile it's used
in (which we do now), or alternately defined and substituted in
configure.ac (which we don't do), and src/egl/ didn't actually have it
from either method. As unset variables expand to nothing, it was
silently being skipped.

Copy & paste the definition to make sure drivers/dri2/ exists before we
try to generate files into it.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reported-by: Nick Sarnie <commendsarnex@gmail.com>
Reported-by: Mike Lothian <mike@fireburn.co.uk>
Fixes: 02cc359372 ("egl/wayland: Use linux-dmabuf interface for buffers")
2017-07-19 13:05:02 +01:00
Kenneth Graunke
2412c4c81e util: Make CLAMP turn NaN into MIN.
The previous implementation of CLAMP() allowed NaN to pass through
unscathed, by failing both comparisons.  NaN isn't exactly a value
between MIN and MAX, which can break the assumptions of many callers.

This patch changes CLAMP to convert NaN to MIN, arbitrarily.  Callers
that need NaN to be handled in a specific manner should probably open
code something, or use a macro specifically designed to do that.
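
As an illustration of the difference, here is a minimal sketch of the two
comparison orders. The macro names CLAMP_OLD and CLAMP_NAN_TO_MIN are made
up for this example, and the exact text of the macro in the patch may differ.

```c
#include <math.h>

/* Old ordering: for NaN both (X) < (LO) and (X) > (HI) are false,
 * so X (the NaN) falls through unchanged. */
#define CLAMP_OLD(X, LO, HI) \
   ((X) < (LO) ? (LO) : ((X) > (HI) ? (HI) : (X)))

/* NaN-to-MIN ordering: for NaN the outer (X) > (LO) is false,
 * so the final branch returns LO. */
#define CLAMP_NAN_TO_MIN(X, LO, HI) \
   ((X) > (LO) ? ((X) > (HI) ? (HI) : (X)) : (LO))

static float
clamp_old(float x, float lo, float hi)
{
   return CLAMP_OLD(x, lo, hi);
}

static float
clamp_nan_to_min(float x, float lo, float hi)
{
   return CLAMP_NAN_TO_MIN(x, lo, hi);
}
```

For ordinary values both forms return the same result; they only differ
when X is NaN.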

Section 2.3.4.1 of the OpenGL 4.5 spec says:

   "Any representable floating-point value is legal as input to a GL
    command that requires floating-point data. The result of providing a
    value that is not a floating-point number to such a command is
    unspecified, but must not lead to GL interruption or termination.
    In IEEE arithmetic, for example, providing a negative zero or a
    denormalized number to a GL command yields predictable results,
    while providing a NaN or an infinity yields unspecified results."

While CLAMP may apply to more than just GL inputs, it seems reasonable
to follow those rules, and allow MIN as an "unspecified result".

This prevents assertion failures in i965 when running the games
"XCOM: Enemy Unknown" and "XCOM: Enemy Within", which call

   glTexEnv(GL_TEXTURE_FILTER_CONTROL_EXT, GL_TEXTURE_LOD_BIAS_EXT,
            -nan(0x7ffff3));

presumably unintentionally.  i965 clamps the LOD bias to be in range,
and asserts that it's in the proper range when converting to fixed
point.  NaN is not, so it crashed.  We'd like to at least avoid that.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-07-18 23:48:46 -07:00
Kenneth Graunke
0320bb2c6c nir: Use nir_src_copy instead of direct assignments.
If the source is an indirect register, there is ralloc'd data.  Copying
with a direct assignment will copy the pointer, but the data will still
belong to the old instruction's memory context.  Since we're lowering
and throwing away instructions, that could free the data by mistake.

Instead, use nir_src_copy, which properly handles this.
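
The ownership problem can be sketched without NIR at all. In the example
below plain malloc/free stand in for ralloc memory contexts, and
struct src_sketch and src_copy_sketch() are hypothetical names; this is not
the nir_src_copy implementation.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical source: 'indirect' is heap data owned by this source,
 * or NULL when there is no indirect. */
struct src_sketch {
   int *indirect;
};

/* Deep copy: dst gets its own allocation instead of sharing src's. */
static bool
src_copy_sketch(struct src_sketch *dst, const struct src_sketch *src)
{
   dst->indirect = NULL;
   if (src->indirect) {
      dst->indirect = malloc(sizeof(*dst->indirect));
      if (!dst->indirect)
         return false;
      *dst->indirect = *src->indirect;
   }
   return true;
}
```

With a direct assignment (dst = src) both sources would share one pointer,
and freeing the old instruction's data would leave dst dangling. After the
deep copy the old data can be freed safely.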

This is admittedly not a common case, so I think the bug is real,
but unlikely to be hit.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-18 23:44:50 -07:00
Timothy Arceri
57165f2ef8 glsl: disable array splitting for AoA
While it produces functioning code the pass creates worse code
for arrays of arrays. See the comment added in this patch for more
detail.

V2: skip splitting of AoA of matrices too.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-19 11:06:23 +10:00
Timothy Arceri
3f0fb23b03 nir: fix nir_opt_copy_prop_vars() for arrays of arrays
Previously we only incremented the guide for a single
dimension/wildcard.

V2: rework logic to avoid code duplication

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
2017-07-19 11:06:23 +10:00
Jason Ekstrand
ecf91898e0 nir/vars_to_ssa: Handle missing struct members in foreach_deref_node
This can happen if, for instance, you have an array of structs and there
are both direct and wildcard references to the same struct and some
members only have direct or only have indirect.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Cc: mesa-stable@lists.freedesktop.org
2017-07-19 11:06:23 +10:00
Kenneth Graunke
d9015b1eab i965/blorp: Use the return value of brw_emit_reloc.
This guarantees that the value written in the batch matches the
value recorded in the relocation entry.

(Chris Wilson wrote an identical patch as well.)
2017-07-18 15:53:33 -07:00
Kenneth Graunke
77844406d5 i965: Delete dead brw_program_reloc function.
Rafael eliminated the last use of brw_program_reloc recently.
2017-07-18 15:45:26 -07:00
Rafael Antognolli
d883ec0400 i965: Convert WM_STATE to genxml on gen4-5.
The code doesn't exactly get a lot simpler, but at least it is in a single
place, and we delete more than we add.

Another good point is that you get rid of struct brw_wm_unit_state
which was a third mechanism for encoding GEN state. We used to have
GENXML, manual packing and these bitfield structs. Now we're down to
just GENXML and some manual packing. (Khristian)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-18 15:45:26 -07:00
Rafael Antognolli
e490382326 i965: Convert CLIP_STATE to genxml.
Add the code into its own function and atom, since almost nothing is
shared with GEN >= 6.

v2: Split GEN <=5 and GEN >= 6 into separate functions (Ken).
v3: Minor tidying by Ken.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-18 15:45:26 -07:00
Daniel Stone
02cc359372 egl/wayland: Use linux-dmabuf interface for buffers
When available, use the zwp_linux_dmabuf_v1 interface to create buffers,
which allows multiple planes and buffer modifiers to be used.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-18 22:16:21 +01:00
Daniel Stone
cfaca5742e egl/wayland: Remove duplicate wl_buffer creation code
Now that create_wl_buffer is generic enough, we can use it for the
EGL_WL_create_wayland_buffer_from_image extension.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-18 22:16:21 +01:00
Daniel Stone
6595c69951 egl/wayland: Remove more surface specifics from create_wl_buffer
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-18 22:16:21 +01:00
Daniel Stone
c4a1c7a2eb egl/wayland: Make create_wl_buffer more generic
Remove surface-specific code from create_wl_buffer, so it's now just a
generic translation from DRIimage to wl_buffer.

Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-18 22:16:20 +01:00
Daniel Stone
7f157a21f1 gbm: Remove is_planar_format dead code
This was only used in create_dumb() to blacklist planar formats.
However, the start of the function already whitelists ARGB8888 (cursor)
and XRGB8888 (scanout), and nothing else. So this entire function can be
removed.

Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-18 22:16:20 +01:00
Daniel Stone
7ac09e0c55 gbm: Check harder for supported formats
Luckily no-one really used the is_format_supported() call, because it
only supported three formats.

Also, since buffers with alpha can be displayed on planes, stop banning
them from use.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-18 22:16:20 +01:00
Daniel Stone
2ede894384 gbm: Pull out FourCC <-> DRIimage format table
Rather than duplicated (yet asymmetric) open-coded tables, pull them out
to a common structure.
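
A single table serving both lookup directions might look like the sketch
below. The struct, the function names and all numeric values are
placeholders for illustration; they are not the real GBM FourCC codes or
DRIimage format constants.

```c
#include <stddef.h>
#include <stdint.h>

/* One row per format pair; both lookups walk the same table. */
struct format_mapping_sketch {
   uint32_t fourcc;        /* placeholder FourCC value */
   int dri_image_format;   /* placeholder DRIimage format value */
};

static const struct format_mapping_sketch format_table_sketch[] = {
   { 0x1001, 1 },
   { 0x1002, 2 },
   { 0x1003, 3 },
};

#define FORMAT_TABLE_LEN \
   (sizeof(format_table_sketch) / sizeof(format_table_sketch[0]))

/* Returns 0 when the FourCC is not in the table. */
static int
fourcc_to_dri_format_sketch(uint32_t fourcc)
{
   for (size_t i = 0; i < FORMAT_TABLE_LEN; i++) {
      if (format_table_sketch[i].fourcc == fourcc)
         return format_table_sketch[i].dri_image_format;
   }
   return 0;
}

/* Returns 0 when the DRIimage format is not in the table. */
static uint32_t
dri_format_to_fourcc_sketch(int dri_image_format)
{
   for (size_t i = 0; i < FORMAT_TABLE_LEN; i++) {
      if (format_table_sketch[i].dri_image_format == dri_image_format)
         return format_table_sketch[i].fourcc;
   }
   return 0;
}
```

Because both directions read the same rows, a format added once is
automatically supported both ways, which avoids the asymmetry of separate
open-coded tables.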

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-18 22:16:20 +01:00
Daniel Stone
6f8d8b17a1 gbm: Axe buffer import format conversion table
Wayland buffers coming from wl_drm use the WL_DRM_FORMAT_* enums, which
are identical to GBM_FORMAT_*. Similarly, FD imports do not need to
convert between GBM and DRI FourCC, since they are (almost) completely
compatible.

This widens the formats accepted by gbm_bo_import() when importing
wl_buffers; previously, only XRGB8888, ARGB8888, RGB565 and YUYV were
supported.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-18 22:16:20 +01:00
Topi Pohjolainen
28ccf8587e i965/gen4: Set tile offsets to zero after depth rebase
Current logic calls intel_renderbuffer_set_draw_offset() which in
turn tries to calculate x and y offset against layer/level settings
that are against the original miptree actually having sufficient
levels/layers. This returns correctly x=0 y=0 regardless of the given
layer/level only because one calls intel_miptree_get_image_offset()
which goes and consults miptree offset table which in turn luckily
contains entries for max-mipmap levels, all initialised to zero even
in case of non-mipmapped.

This patch stops consulting the table and simply sets the draw
offsets to zero that are compatible with the single slice miptree
backing the renderbuffer.
This prepares for ISL based miptrees that calculate offsets
on-demand and do not tolerate levels beyond what the miptree has.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:36:13 +03:00
Topi Pohjolainen
7507563291 i965: Refactor check for separate stencil
v2 (Jason): s/needs_stencil/needs_separate_stencil/

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:36:13 +03:00
Topi Pohjolainen
0926fb69a4 intel/blorp/gen4: Drop cube map flag for single face copy
This will falsely trigger an assert on number of layers once
isl is used for 3D layouts of Gen4 cube maps.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:36:13 +03:00
Topi Pohjolainen
91608e4ac1 i965/wm: Use level offsets directly
dropping the dependency on the slice table.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:36:13 +03:00
Topi Pohjolainen
5ff1d76caa i965: Use offset helper in intel_readpixels_tiled_memcpy()
providing support for isl-based miptrees.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:36:13 +03:00
Topi Pohjolainen
b8d63f50ee i965/miptree: Pass flags instead of explicit tiling to surface creator
allowing one to use isl tiling filter.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:36:13 +03:00
Topi Pohjolainen
f23599fa5b i965/miptree: Add pitch override for imported buffer objects
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:36:13 +03:00
Topi Pohjolainen
d401bd3e8f i965/miptree: Stop setting total_width/height for existing bo
Now that image surface vertical slice calculator doesn't depend
on total_height, total dimensions are only needed when new buffer
objects are created. Therefore one can safely ignore them when
miptrees are created for already existing buffer objects.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:36:13 +03:00
Topi Pohjolainen
0a287e4501 i965/wm: Use isl for filling tex image parameters
This helps to drop the dependency on miptree::total_height which is
used in brw_miptree_get_vertical_slice_pitch().

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:36:03 +03:00
Topi Pohjolainen
4733891e51 intel/isl: Take 3D surfaces into account in image params
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:35:44 +03:00
Topi Pohjolainen
2309363868 i965/miptree: Check for miptree_create() failures
Rest of the function assumes it always succeeds.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:35:03 +03:00
Topi Pohjolainen
0320d9bd84 i965/miptree: Do not rely on msaa type to decide if aux is needed
Once the driver moves to ISL both compressed and uncompressed have
the same type. One needs to tell them apart by other means. This
can be done by checking the existence of mcs_buf.

There is a short period of time within intel_miptree_create()
where mcs_buf doesn't exist yet (between calls to
intel_miptree_create_layout() and intel_miptree_alloc_mcs()).
First compute_msaa_layout() makes the decision if compression is
to be used and sets the msaa_layout type. Then based on the type
one sets aux_usage and finally decides if mcs_buf is needed.

This patch duplicates the logic in compute_msaa_layout() and uses
that to make the decision on aux_usage and mcs_buf allocation.
Most of the original logic in compute_msaa_layout() will be gone
in a later patch, leaving only one version.

Elsewhere only brw_populate_sampler_prog_key_data() needs to know
if compression is used based on the msaa_type. This is now
replaced with consideration for number of samples and existence
of mcs_buf. All other occurrences consider CMS || UMS which can
be represented using the single type ISL_MSAA_LAYOUT_ARRAY
without any tweaks.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:35:03 +03:00
Topi Pohjolainen
d86244fb16 i965: Make irb::mt_layer logical instead of physical
same as irb::layer_count. In case of copies and blits msaa
surfaces already fall to blorp which natively works with logical
slices.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:35:03 +03:00
Topi Pohjolainen
b9400b7ecd i965/tex: Use offset helper instead of accessing table directly
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:35:03 +03:00
Topi Pohjolainen
ee97b78a3e i965: Mark read-only args as const in intel_miptree_supports_hiz()
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:35:03 +03:00
Topi Pohjolainen
38f3d03ea9 i965/miptree: Use > 1 instead of > 0 to check for multisampling
Checking against zero currently works as single sampling is
represented with zero. Once one moves to isl, single sampling
really has a sample count of one.

This keeps later patches simpler.
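
A minimal sketch of why the stricter comparison is safe under both
conventions (the helper name is hypothetical):

```c
#include <stdbool.h>

/* Works whether single sampling is encoded as 0 (current miptree code)
 * or as 1 (isl): neither value is greater than 1. */
static bool
is_multisampled_sketch(unsigned num_samples)
{
   return num_samples > 1;
}
```

A check against zero would start reporting single-sampled surfaces as
multisampled as soon as the sample count becomes one.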

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:35:03 +03:00
Topi Pohjolainen
8fd18642e7 i965/miptree: Set refcount before failing via _release()
Otherwise one wraps uint to UINT_MAX via -1.
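
The wrap-around can be shown with a small sketch. struct miptree_sketch and
release_sketch() are hypothetical names, not the i965 functions.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical refcounted object. */
struct miptree_sketch {
   unsigned refcount;
};

/* Drop one reference. Returns true when the last reference is gone and
 * the object should be freed. */
static bool
release_sketch(struct miptree_sketch *mt)
{
   return --mt->refcount == 0;
}
```

If the failure path calls release while refcount is still 0, the unsigned
decrement wraps to UINT_MAX and the object is never freed. Setting
refcount to 1 first makes the same call free it as intended.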

Fixes: 3cf470f2b6 ("i965: Add isl based miptree creator")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-18 21:35:03 +03:00
Kenneth Graunke
c2bb39d8d6 build: Add $(top_srcdir)/src/compiler/spirv to AM_CPPFLAGS
Generated C files try to include spirv_info.h.  For in-tree builds,
the header is in the same directory, so it just works.  For out-of-tree
builds, we need to look for it in srcdir rather than builddir.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101831
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-18 11:14:47 -07:00
Marek Olšák
ecec21add2 radeonsi: add back the USE_MINIMUM_PRIORITY flag to the low-prio compiler queue
Accidentally removed in 9f320e0a38.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-18 13:13:34 -04:00
Jason Ekstrand
2f52d2dc97 compiler/spirv: Add a .gitignore and ignore spirv_info.c
2017-07-18 09:49:13 -07:00
Jason Ekstrand
cd9fd68a50 anv: Advertise support for VK_KHR_variable_pointers
We don't support the general version yet because that requires us to
lower shared variables up-front in SPIR-V -> NIR.  This shouldn't be a
whole lot of work but it's not something we support today.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-07-18 09:43:13 -07:00
Jason Ekstrand
bc9319583a anv: Advertise support for VK_KHR_storage_buffer_storage_class
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-07-18 09:43:13 -07:00
Jason Ekstrand
f2fe74a462 nir/spirv: Add support for SPV_KHR_variable_pointers
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-07-18 09:43:12 -07:00
Jason Ekstrand
182950ceaf nir/spirv: Add a helper for pushing SSA values
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-07-18 09:43:12 -07:00
Jason Ekstrand
868456fbf7 nir/spirv: Implement OpPtrAccessChain for buffers
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-07-18 09:43:12 -07:00
Jason Ekstrand
a968889237 spirv/nir: Add some useful asserts for type decorations
Now that vtn_type has piles of unions, we should assert sanity before
setting fields that may stomp others.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-07-18 09:43:12 -07:00
Jason Ekstrand
999918bd01 spirv: Add support for the StorageBuffer storage class
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-07-18 09:43:12 -07:00
Ian Romanick
2dd4e2ece3 spirv: Generate spirv_info.c
The old table based spirv_*_to_string functions would return NULL for
any values "inside" the table that didn't have entries.  The tables also
needed to be updated by hand each time a new spirv.h was imported.
Generate the file instead.

v2: Make this script work more like src/mesa/main/format_fallback.py.
Suggested by Jason.  Remove SCons support.  Suggested by Jason and
Emil.  Put all the build work in Makefile.nir.am in lieu of adding a new
Makefile.spirv.am.  Suggested by Emil.  Add support for Android builds
based on code provided by Emil.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-18 09:43:12 -07:00
Ian Romanick
de765ec9dc spirv: Import the latest 1.0.2 JSON from Khronos
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-18 09:43:12 -07:00
Jason Ekstrand
7141e8105a spirv: Import the latest 1.2 header from Khronos
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-07-18 09:43:12 -07:00
Brian Paul
9d8ebf1c77 mesa: whitespace fixes in get.c
Remove trailing whitespace.
Replace tabs with spaces.
Trivial.
2017-07-18 08:32:29 -06:00
Brian Paul
3d49fcb3e5 mesa: fix GL_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION_EXT query
This query is not allowed in GL core profile 3.3 and later (since
GL_QUADS and GL_QUAD_STRIP are disallowed).  The query was (mistakenly)
supported in GL 3.2.  This fixes the glGet error test accordingly.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-07-18 08:32:29 -06:00
Eric Engestrom
a522ce9977 vulkan/util: fix typo in comment
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-18 13:56:04 +01:00
Samuel Pitoiset
838b9c21d4 mapi: add missing no_error tag to glBlitNamedFramebuffer()
Fixes: 6fedb31785 ("mesa: add KHR_no_error support for glBlitNamedFramebuffer()")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-18 10:07:34 +02:00
Alex Smith
f25c7f9f3e radv: Set the RADEON_SURF_OPTIMIZE_FOR_SPACE flag for images
This looks like a regression from df30123794 ("radv: use
ac_compute_surface"). Before that, the opt4Space addrlib flag was set
to true unless the image has FMASK (ac_compute_surface will similarly
only set that flag for images without FMASK).

This saves multiple gigabytes of VRAM on one of our games, and brings
its VRAM utilisation on RADV in line with AMDGPU-PRO and NVIDIA.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-18 16:18:35 +10:00
Dave Airlie
687d241559 radv: don't shadow meta_va.
Coverity warned about dead code below, as meta_va was being shadowed.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-18 16:17:28 +10:00
Kenneth Graunke
795848c232 i965: Delete brw_sf_state.c again
"...and stay dead!"

Rafael deleted this file in c2b5a26dc2
(i965: Convert SF_STATE to genxml.) but Marek accidentally brought it
back in commit e7a091936f (mesa: replace
ctx->Polygon._FrontBit with a helper function) when resolving conflicts.

It's not actually even compiled, but it's still here trolling people
into thinking it still exists and needs patching.
2017-07-17 22:46:19 -07:00
Connor Abbott
91dd2ca99f ac/nir: rewrite shared variable handling (v2)
Translate the NIR variables directly to LLVM instead of lowering to a
TGSI-style giant array of vec4's and then back to a variable. This
should fix indirect dereferences, make shared variables more tightly
packed, and make LLVM's alias analysis more precise. This should fix an
upcoming Feral title, which has a compute shader that was failing to
compile because the extra padding made us run out of LDS space.

v2: Combine the previous two patches into one, only use this for shared
variables for now until LLVM becomes smarter.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Alex Smith <asmith@feralinteractive.com>
2017-07-17 14:16:03 -07:00
Jason Ekstrand
7947d05f84 i965: Check if the modifier is supported in select_best_modifier
Otherwise, if a client gave us a list of modifiers that contained a
modifier we understand but which is not supported on the hardware, we
might return that one and then fail to create the image.

Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-07-17 13:48:38 -07:00
Jason Ekstrand
ec4364d57e i965: Rework the modifier info map
This commit splits the mapping in half.  The modifier_infos table now
only contains the modifier and the since_gen field.  The tiling bits
have been moved into a table in tiling_to_modifier as that's the only
place it was ever used.  The modifier_is_supported function now takes a
devinfo and does the since_gen check.
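
The split described above could be sketched like this. The names
modifier_infos, since_gen and modifier_is_supported come from the commit
message; the struct layouts, the modifier values and the generation
numbers are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device info structure. */
struct devinfo_sketch {
   int gen;
};

/* The table only records the modifier and the first generation
 * that supports it. */
struct modifier_info_sketch {
   uint64_t modifier;
   int since_gen;
};

/* Placeholder modifier values and generations. */
static const struct modifier_info_sketch modifier_infos[] = {
   { 0x0, 1 },
   { 0x1, 1 },
   { 0x2, 6 },
};

static bool
modifier_is_supported(const struct devinfo_sketch *devinfo, uint64_t modifier)
{
   const size_t count = sizeof(modifier_infos) / sizeof(modifier_infos[0]);

   for (size_t i = 0; i < count; i++) {
      if (modifier_infos[i].modifier == modifier)
         return devinfo->gen >= modifier_infos[i].since_gen;
   }
   return false;   /* unknown modifiers are never supported */
}
```

A known modifier is thus still rejected on hardware older than its
since_gen, which is the case a modifier selection routine has to filter
out before creating an image.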

Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-07-17 13:48:38 -07:00
Jason Ekstrand
f44171ef62 i965/surface_state: Remove the mcs_buf->offset == 0 restriction
This assert was removed in b0cc55f298 but
got added back in 1a43d774b6, probably by
accident.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-17 13:48:38 -07:00
Jason Ekstrand
828c437078 intel/isl: Add a row_pitch parameter to surf_get_ccs_surf
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-17 13:48:38 -07:00
Jason Ekstrand
766784ef82 i965/miptree: Use BO_ALLOC_ZEROED for CCS_E buffers
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-17 13:48:38 -07:00
Jason Ekstrand
cbee2d1102 i965/screen: Allocate ZEROED BOs for images
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-17 13:48:38 -07:00
Jason Ekstrand
fb0caadc2a i965/bufmgr: Add a BO_ALLOC_ZEROED flag
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-17 13:48:38 -07:00
Jason Ekstrand
14570ecf63 i965/miptree: Replace is_lossless_compressed with mt->aux_usage checks
Now that we have an actual aux_usage field, we no longer need the
complex logic of is_lossless_compressed in order to figure out if a
miptree is CCS_E compressed.  As a side-effect, there is no longer any
need to overload MSAA_LAYOUT_CMS for CCS_E and we can stop doing so.

Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-17 13:48:38 -07:00
Jason Ekstrand
67143a5037 i965/miptree: Allocate HiZ up-front
HiZ, like MCS and CCS_E, can compress more than just clear colors so we
want it turned on whenever the miptree is being used as a depth
attachment.  It's theoretically possible for someone to create a depth
texture, upload data with glTexSubImage2D, and texture from it without
ever binding it as a depth target.  If this happens, we would end up
wasting a bit of space by allocating a HiZ surface we never use.
However, this is rather unlikely outside of test cases, so we're better
off just allocating it up-front.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-17 13:48:38 -07:00
Jason Ekstrand
138316cc99 i965/miptree: Add an intel_tiling_supports_hiz helper
We need this split for the same reason that we need the split for CCS:
intel_miptree_supports_hiz is called *before* we choose the actual
tiling.  Adding a tiling_supports_hiz helper lets choose_aux_usage
more accurately decide whether or not to enable hiz.  In particular,
this prevents us from enabling HiZ on linear depth buffers.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-17 13:48:38 -07:00
Jason Ekstrand
e6b8877a54 i965/miptree: Gather initial aux allocation into a single function
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-17 13:48:38 -07:00
Charmaine Lee
d8f51bfcbf st/mesa: init winsys buffers list only if context creation succeeds
Fixes piglit test crash when context creation fails.

v2: As suggested by Brian, move the init to st_create_context_priv()

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-07-11 22:46:55 -07:00
Sinclair Yeh
ed45e8db3c winsys/svga/drm: Enable import/export fence FD
Enable the capability if the DRM supports it.

Hook up mechanism to send and receive fence FD from the DRM.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-17 10:09:25 -06:00
Sinclair Yeh
d554f72c41 winsys/svga/drm: Connect winsys-side fence_* functions
Connect fence_get_fd, fence_create_fd, and fence_server_sync.

Implement the required functions in vmw_fence module.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-17 10:09:25 -06:00
Sinclair Yeh
56a6e890f3 drivers/svga: Connect driver-side fence_* functions
Connect fence_get_fd, fence_create_fd, and fence_server_sync.
Return PIPE_CAP_NATIVE_FENCE_FD capability based on what the
winsys reports

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-17 10:09:25 -06:00
Sinclair Yeh
4da543e30a winsys/svga/drm: Create winsys interface for Fence FD
The new interfaces will be used to enable
EGL_ANDROID_native_fence_sync.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-17 10:09:25 -06:00
Sinclair Yeh
2431cccad1 winsys/svga/drm: Prepare to support fence fd
Make the fields and flags available.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-17 10:09:25 -06:00
Sinclair Yeh
65175df601 drivers/svga, winsys/svga/drm: Thread through timeout for fence_finish
The timeout parameter is required to implement
EGL_ANDROID_native_fence_sync.

v2
* Changed the default timeout from 0 to PIPE_TIMEOUT_INFINITE
* Add more documentation to the new timeout parameter

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-17 10:09:25 -06:00
Brian Paul
9ee86d6db7 svga: whitespace clean-up in svga_winsys.h
Trivial.
2017-07-17 10:09:25 -06:00
Brian Paul
6f4923bd38 svga: add some const qualifiers
Trivial.
2017-07-17 10:06:01 -06:00
Brian Paul
589f546256 svga: add comment about 'extra' constant locations
Trivial.
2017-07-17 10:06:00 -06:00
Jason Ekstrand
c5700ed72e anv/image: Add INPUT_ATTACHMENT to the list of required usages
From the Vulkan 1.0.53 spec VU for vkCreateImageView:

    "image must have been created with a usage value containing at least
    one of VK_IMAGE_USAGE_SAMPLED_BIT, VK_IMAGE_USAGE_STORAGE_BIT,
    VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT,
    VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT, or
    VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT"

We were missing VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT from our list.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-07-17 08:18:46 -07:00
Jason Ekstrand
cbdfd1daa2 anv: Stop leaking the no_aux sampler surface state
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-07-17 08:18:46 -07:00
Jason Ekstrand
bd41564746 anv/cmd_buffer: Properly handle render passes with 0 attachments
We were early returning and never created the NULL surface state.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: James Legg <jlegg@feralinteractive.com>
Cc: mesa-stable@lists.freedesktop.org
2017-07-17 08:18:46 -07:00
Marek Olšák
c62809171c radeonsi/gfx9: add VM fault dmesg parser support
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:57:34 -04:00
Marek Olšák
9f320e0a38 radeonsi: automatically resize shader compiler thread queues when they are full
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:57:29 -04:00
Marek Olšák
4cae274116 radeonsi: prevent a deadlock in util_queue_add_job with too many GL contexts
If the queue is full, util_queue_add_job will wait while bo_fence_lock is
held.

If pb_slab wants to reuse a buffer, it will lock the pb_slab mutex and
try to check BO fence busyness, but it has to wait for bo_fence_lock to get
released. Both bo_fence_lock and pb_slab mutex are locked now.

When the CS thread unreferences and releases a suballocated buffer,
it will try to lock the pb_slab mutex and has to wait. The CS thread
can't finish its job in order to free a queue slot and unblock
util_queue_add_job ==> deadlock.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:57:25 -04:00
Marek Olšák
59ad769770 util/u_queue: add an option to resize the queue when it's full
Consider the following situation:
  mtx_lock(mutex);
  do_something();
  util_queue_add_job(...);
  mtx_unlock(mutex);

If the queue is full, util_queue_add_job will wait for a free slot.
If the job which is currently being executed tries to lock the mutex,
it will be stuck forever, because util_queue_add_job is stuck.

The deadlock can be trivially resolved by increasing the queue size
(reallocating the queue) in util_queue_add_job if the queue is full.
Then util_queue_add_job becomes wait-free.
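
The grow-instead-of-wait idea can be sketched with a single-threaded ring
buffer. This is an illustration only: struct job_queue_sketch and its
functions are hypothetical, jobs are plain ints, and the locking and
worker threads of the real util_queue are left out.

```c
#include <stdbool.h>
#include <stdlib.h>

struct job_queue_sketch {
   int *jobs;             /* ring buffer of job ids */
   unsigned max_jobs;     /* current capacity, must be > 0 */
   unsigned num_queued;
   unsigned read_idx;
   unsigned write_idx;
   bool resize_if_full;
};

static bool
queue_init_sketch(struct job_queue_sketch *q, unsigned max_jobs,
                  bool resize_if_full)
{
   q->jobs = calloc(max_jobs, sizeof(*q->jobs));
   if (!q->jobs)
      return false;
   q->max_jobs = max_jobs;
   q->num_queued = 0;
   q->read_idx = 0;
   q->write_idx = 0;
   q->resize_if_full = resize_if_full;
   return true;
}

/* Returns false where a blocking queue would have had to wait. */
static bool
queue_add_job_sketch(struct job_queue_sketch *q, int job)
{
   if (q->num_queued == q->max_jobs) {
      if (!q->resize_if_full)
         return false;

      /* Double the capacity and unroll the ring into the new array. */
      unsigned new_max = q->max_jobs * 2;
      int *jobs = calloc(new_max, sizeof(*jobs));
      if (!jobs)
         return false;
      for (unsigned i = 0; i < q->num_queued; i++)
         jobs[i] = q->jobs[(q->read_idx + i) % q->max_jobs];
      free(q->jobs);
      q->jobs = jobs;
      q->read_idx = 0;
      q->write_idx = q->num_queued;
      q->max_jobs = new_max;
   }

   q->jobs[q->write_idx] = job;
   q->write_idx = (q->write_idx + 1) % q->max_jobs;
   q->num_queued++;
   return true;
}

/* Returns false when the queue is empty. */
static bool
queue_take_job_sketch(struct job_queue_sketch *q, int *job)
{
   if (q->num_queued == 0)
      return false;
   *job = q->jobs[q->read_idx];
   q->read_idx = (q->read_idx + 1) % q->max_jobs;
   q->num_queued--;
   return true;
}
```

With resizing enabled the add path never has to wait for a consumer, so a
caller holding a mutex cannot be blocked by a full queue. The trade-off is
unbounded memory growth if producers outpace the workers.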

radeonsi will use it.
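
A minimal sketch of the idea, assuming a simple ring buffer of integer
jobs. The names (job_queue, job_queue_add, ...) are hypothetical and are
not the util_queue API; locking and worker threads are left out so that
only the "grow instead of wait" step is shown:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical ring-buffer job queue; not Mesa's util_queue. */
struct job_queue {
    int *jobs;          /* job payloads (plain ints for brevity) */
    unsigned capacity;  /* number of allocated slots */
    unsigned count;     /* number of queued jobs */
    unsigned read_idx;  /* index of the oldest queued job */
};

int job_queue_init(struct job_queue *q, unsigned capacity)
{
    assert(capacity > 0);
    q->jobs = malloc(capacity * sizeof(*q->jobs));
    q->capacity = capacity;
    q->count = 0;
    q->read_idx = 0;
    return q->jobs != NULL;
}

/* Add a job. If the ring is full, double it instead of waiting for a slot. */
int job_queue_add(struct job_queue *q, int job)
{
    if (q->count == q->capacity) {
        unsigned new_capacity = q->capacity * 2;
        int *jobs = malloc(new_capacity * sizeof(*jobs));
        if (!jobs)
            return 0;
        /* Unwrap the ring so that the oldest job lands at index 0. */
        for (unsigned i = 0; i < q->count; i++)
            jobs[i] = q->jobs[(q->read_idx + i) % q->capacity];
        free(q->jobs);
        q->jobs = jobs;
        q->capacity = new_capacity;
        q->read_idx = 0;
    }
    q->jobs[(q->read_idx + q->count) % q->capacity] = job;
    q->count++;
    return 1;
}

/* Remove the oldest job; returns 0 if the queue is empty. */
int job_queue_pop(struct job_queue *q, int *job)
{
    if (q->count == 0)
        return 0;
    *job = q->jobs[q->read_idx];
    q->read_idx = (q->read_idx + 1) % q->capacity;
    q->count--;
    return 1;
}
```

Because job_queue_add never waits for a consumer, a caller holding a
mutex cannot be blocked by a job that wants the same mutex.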

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:57:20 -04:00
Marek Olšák
465bb47d6f radeonsi: expose ARB_timer_query unconditionally
clock_crystal_freq is always non-zero now.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:57:17 -04:00
Marek Olšák
3d1a576fa6 ac/gpu_info: if clock crystal frequency is 0, print an error and set 1
During bring-up, this is often 0. Prevent automatic disablement of
ARB_timer_query and demotion of the OpenGL version to 3.2 by setting
a non-zero frequency. Print an error message instead.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:56:59 -04:00
Marek Olšák
d0963ef084 radeonsi/gfx9: don't read back non-existent register SRBM_STATUS2
It looks like there is no way to monitor SDMA busyness on GFX9.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:56:56 -04:00
Marek Olšák
5fb80a1e84 radeonsi: prevent a crash with DBG_CHECK_VM and u_threaded_context
by setting PIPE_CONTEXT_DEBUG in the caller

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:56:51 -04:00
Marek Olšák
ddbd2f4c54 ac/surface/gfx9: flags.texture currently refers to TC-compatible HTILE
This should lead to better MSAA performance on GFX9.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:56:46 -04:00
Marek Olšák
ffa7ec9e22 radeonsi: simplify computation of tessellation offchip buffers
This is overly cautious, but better safe than sorry.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:55:07 -04:00
Marek Olšák
facfab28fe radeonsi/gfx9: add workarounds to avoid VGPR indexing completely
For inputs and outputs, indirect indexing is lowered by the GLSL compiler.
For temporaries, use alloca and disable the "promote-alloca" pass.

In the future, we could switch all codepaths to alloca permanently and
just rely on the "promote-alloca" pass.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
93391ac478 radeonsi: emit param exports after position exports
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
9d9ffc8475 radeonsi: move building parameter exports into a separate function
Both loops now look simple.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
4e30fb4ecc radeonsi: don't use info.num_inputs when it's unused
For clarity. It's only used by color interpolation.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
f8d6dd9b3d radeonsi: add si_build_fs_interp helper
This is much simpler.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
4560f2b90a radeonsi: merge si_llvm_get_amdgpu_target into ac_get_llvm_target
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
c351037d6c gallivm: inline gallivm_init_llvm_targets
there is only one user.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
ece0c0439f radeonsi: don't call gallivm_init_llvm_targets
It's for initializing the native (x86) target.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
d308460586 gallium/radeon: reallocate suballocated buffers when exported
This should fix exports of suballocated buffers.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Marek Olšák
5b555854cc gallium/radeon: flush the context after in-place texture realloc before export
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 10:50:39 -04:00
Mark Thompson
63dcfed81f st/va: Fix scaling list ordering for H.265
Mesa here requires the scaling lists in diagonal scan order, but
VAAPI passes them in raster scan order.  Therefore, rearrange the
elements when copying.

v2: Move scan tables to vl_zscan.c.
    Fix type in size assertion.
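
A sketch of the reordering, assuming the order meant here is the
up-right diagonal scan of the H.265 specification. The function names
are hypothetical; the actual tables live in vl_zscan.c and are not
reproduced here:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Fill scan[] so that scan[i] is the raster index (y * size + x) of the
 * i-th element in up-right diagonal order: each anti-diagonal is walked
 * from its bottom-left end to its top-right end.
 */
void build_diag_scan(unsigned size, uint8_t *scan)
{
    unsigned i = 0;

    assert(size <= 16); /* indices must fit in uint8_t */
    for (unsigned d = 0; d < 2 * size - 1; d++) {
        for (unsigned x = 0; x <= d; x++) {
            unsigned y = d - x;
            if (x < size && y < size)
                scan[i++] = (uint8_t)(y * size + x);
        }
    }
}

/* Copy a raster-order list into diagonal scan order. */
void raster_to_diag(const uint8_t *raster, uint8_t *diag,
                    const uint8_t *scan, unsigned count)
{
    for (unsigned i = 0; i < count; i++)
        diag[i] = raster[scan[i]];
}
```

For a 4x4 list the generated table starts 0, 4, 1, 8, 5, 2, so the
second element in scan order is the first element of the second row.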

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-07-17 15:24:56 +01:00
Emil Velikov
4168c162c5 radv: advertise v6 of the wayland surface extension
Jason updated the Khronos spec to explicitly state that Wayland surfaces
must support VK_PRESENT_MODE_MAILBOX_KHR.

ANV has done so since day one (back in 2015)

Cc: mesa-stable@lists.freedesktop.org
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-17 15:24:48 +01:00
Emil Velikov
43c188f970 anv: advertise v6 of the wayland surface extension
Jason updated the Khronos spec to explicitly state that Wayland surfaces
must support VK_PRESENT_MODE_MAILBOX_KHR.

ANV has done so since day one (back in 2015)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-17 15:24:32 +01:00
Emil Velikov
647b5a18df i965: use strtol to convert the integer deviceID override
One can override the deviceID, by setting the INTEL_DEVID_OVERRIDE
variable. A few symbolic names or a numerical value for the actual
device ID is accepted.

At the same time we're using strtod (string to double) to convert the
string to a decimal numeral. This seems to be a thinko made by the
original commit that introduced the code in libdrm_intel, which got here
with the import.
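
A sketch of integer parsing with strtol for a numerical override value.
parse_devid is a hypothetical helper, not the i965 function, and the
16-bit range check is an added assumption based on PCI device IDs being
16 bits wide; symbolic names would be handled separately:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parse a numerical device ID such as "0x1916"; returns -1 if malformed. */
int parse_devid(const char *str)
{
    char *end;
    long val;

    assert(str != NULL);
    errno = 0;
    /* Base 0 accepts decimal, 0x-prefixed hex and 0-prefixed octal. */
    val = strtol(str, &end, 0);
    if (errno != 0 || end == str || *end != '\0' || val < 0 || val > 0xffff)
        return -1;
    return (int)val;
}
```

strtol yields an integer directly, so no floating-point conversion is
involved at any point.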

Fixes: 514db96c11 ("i965: Import libdrm_intel.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-17 15:23:49 +01:00
Marek Olšák
f9d5611617 gallium/u_blitter: don't use TXF for scaled blits
There seems to be a rounding difference with F2I vs nearest filtering.
The precise problem in the rounding is unknown.

This fixes an incorrect output with OpenMAX encoding.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 15:47:30 +02:00
Lionel Landwerlin
59adde0eab anv: ensure device name contains terminating character
v2: Use sizeof() (Chris)

CID: 1415113
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-07-17 14:36:38 +01:00
Lionel Landwerlin
f03f893cb8 i965: miptree: silence coverity warning
This probably can't happen, but we're better off with initialized
variables.

CID: 1415114
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-07-17 14:36:38 +01:00
Marek Olšák
0d190913bf mesa: flag _NEW_TEXTURE_OBJECT for GL_TEXTURE_LOD_BIAS_EXT
Only the compatibility profile can set it.
It was done incorrectly when we split _NEW_TEXTURE.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-17 15:27:14 +02:00
Kenneth Graunke
32c79cdacc meta: Actually initialize ImmutableLevels to 1.
Otherwise, ImmutableLevels is 0, which is an illegal value.  Later,
_mesa_meta_setup_sampler will use _mesa_texture_parameteriv to set

   texObj->MaxLevel = CLAMP(params[0], texObj->BaseLevel,
                            texObj->ImmutableLevels - 1);

which turns into a completely bogus CLAMP(value, 0, -1)...where the
upper bound is smaller than the lower bound.  This ends up being -1
today due to the way CLAMP is implemented, which is a bogus MaxLevel.
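
To see where the -1 comes from, here is a sketch that assumes a CLAMP
macro of the common nested-ternary form (written out here, not copied
from the Mesa headers). clamp_max_level is a hypothetical helper that
mirrors the expression above:

```c
#include <assert.h>

/* Assumed macro shape: check the lower bound first, then the upper. */
#define CLAMP(x, lo, hi) ((x) < (lo) ? (lo) : ((x) > (hi) ? (hi) : (x)))

/* Hypothetical helper mirroring the MaxLevel computation. */
int clamp_max_level(int requested, int base_level, int immutable_levels)
{
    assert(base_level >= 0);
    return CLAMP(requested, base_level, immutable_levels - 1);
}
```

With immutable_levels == 0 the upper bound is -1: a requested level of 0
is not below the lower bound, but it is above the upper bound, so the
result is -1. With immutable_levels == 1 the same call returns 0.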

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-07-17 01:37:51 -07:00
Kenneth Graunke
6374288b62 dri: Make classic drivers allow __DRI_CTX_FLAG_NO_ERROR.
Grigori recently added EGL_KHR_create_context_no_error support,
which causes EGL to pass a new __DRI_CTX_FLAG_NO_ERROR flag to
drivers when requesting an appropriate context mode.

driContextSetFlags() will already handle it properly for us, but the
classic drivers all have code to explicitly balk at unknown flags.  We
need to let it through or they'll fail to create a no_error context.
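
The kind of check involved can be sketched as follows. The flag values
and function names are hypothetical and stand in for the real
__DRI_CTX_FLAG_* definitions and driver code:

```c
#include <assert.h>

/* Hypothetical flag bits used only for this illustration. */
#define CTX_FLAG_DEBUG    (1u << 0)
#define CTX_FLAG_ROBUST   (1u << 1)
#define CTX_FLAG_NO_ERROR (1u << 2)

/* Returns 1 if every requested flag is in the allowed set, 0 otherwise. */
int context_flags_ok(unsigned requested, unsigned allowed)
{
    return (requested & ~allowed) == 0;
}

/* Usage: a driver that does not list NO_ERROR rejects the request. */
void context_flags_example(void)
{
    unsigned old_allowed = CTX_FLAG_DEBUG | CTX_FLAG_ROBUST;
    unsigned new_allowed = old_allowed | CTX_FLAG_NO_ERROR;

    assert(!context_flags_ok(CTX_FLAG_NO_ERROR, old_allowed));
    assert(context_flags_ok(CTX_FLAG_NO_ERROR, new_allowed));
}
```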

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
2017-07-17 01:37:51 -07:00
Samuel Pitoiset
c745beaf10 ddebug: fix parsing of the pipelined mode
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-17 10:28:45 +02:00
Dave Airlie
9ee67467c9 radv: predicate cmask eliminate when using DCC.
When using DCC some clear values don't require a cmask eliminate
step. This patch adds support for black and black with alpha 1,
there are other values, but I don't have access to a comprehensive list.

This works by setting the cmask eliminate predicate when doing the
fast clear, and later when doing the cmask elimination making sure
the draws are predicated.

This increases the fps on Sascha Willems' deferred demo.

Tonga: 580fps->670fps on a Tonga PRO card.
Polaris 730->850fps

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-17 01:44:43 +01:00
Dave Airlie
8eed291c2c radv/clear: add r32g32b32a32 fast clear support (v2)
We can only fast clear 128-bit images if the r/g/b channels
are the same, and we are using DCC.

For DCC we'll bail out on translate if this isn't true,
and we catch cmask clears explicitly.

v2: remove 64-bit block (Bas), add uint32 as well.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-17 01:44:25 +01:00
Dave Airlie
acf1e132af amd/addrlib: fix typo in api name.
This fixes the misspelling of ALIGNMENTS in addrlib.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-17 01:44:14 +01:00
Dave Airlie
f8d5b377c8 radv: set cb base tile swizzles for MRT speedups (v4)
This patch uses addrlib to work out the tile swizzles according
to the surface index. It seems to produce the same values as
amdgpu-pro for the deferred test.

v2: don't apply swizzle to CMASK. the eg docs don't mention
it, and we clearly don't align cmask for that.
v3: disable surf index for dedicated images, as these will
most likely be shared, and I don't think the metadata has
space for this info in it yet.
v4: update for shareable images, rename combined_swizzle
to tile_swizzle

This gets the deferred demo from 730->950fps on my rx480.
(dcc cmask elim predication patches get it further)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-17 01:43:41 +01:00
Dave Airlie
b86f86f55c radv: allow clear merging for depth/stencil with no care stencil
Some of the Sascha Willems demos pick a D32/S8 format for the depth
buffer, then do a LOAD_OP_CLEAR/LOAD_OP_DONT_CARE on it, which means
we don't get to merge the undefined->depth and clear htile transitions.

This adds the stencil aspect to the pending clears if there is a depth
clear pending and the stencil aspect is don't care.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-17 01:16:59 +01:00
Bas Nieuwenhuizen
373f707fbb radv: Remove NV dedicated alloc extension.
To not confuse apps in thinking it might be faster.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
2017-07-15 20:10:43 +02:00
Bas Nieuwenhuizen
515da29360 radv: Use the KHR dedicated alloc for the WSI.
NV isn't valid for external images anymore.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: 6ddc64b93e "radv: Add support for VK_KHR_dedicated_allocation."
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
2017-07-15 20:10:25 +02:00
Jason Ekstrand
b70829708a radv: Implement VK_KHR_external_memory
This effectively reverts commit 43a171878bb4b5aedb36a.  Technically,
VK_KHR_get_memory_requirements2 and VK_KHR_dedicated_allocation are
required for the KHR version but this at least restores the removed
functionality.  This patch builds but has received zero testing.

Acked-by: Dave Airlie <airlied@redhat.com>
2017-07-15 08:59:38 -07:00
Bas Nieuwenhuizen
6ddc64b93e radv: Add support for VK_KHR_dedicated_allocation.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-07-15 08:59:38 -07:00
Bas Nieuwenhuizen
97931f0297 radv: Add support for VK_KHR_get_memory_requirements2.
Fished the SparseImage call out of the headers as the spec missed
the definition.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand
0ee8d81718 anv: Implement VK_KHR_external_memory_*
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand
c02da9cad6 anv: Implement VK_KHR_dedicated_allocation
We always recommend sub-allocation and don't do anything special for
dedicated allocations.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand
8c82aa5f43 anv: Implement VK_KHR_get_memory_requirements2
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand
5b57bdc1cf anv: Advertise version 1.0.54
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand
227debdc92 vulkan: Update to the new 1.0.54 spec XML and headers
There is one small ANV change here because we used the
VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX enum in the BO cache and that had
to be updated to have the _KHR suffix.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:59:38 -07:00
Jason Ekstrand
3b95e03b2c radv: Drop support for VK_KHX_external_semaphore_*
These have been formally deprecated by Khronos never to be shipped
again.  The KHR versions should be implemented/used instead.

Acked-by: Dave Airlie <airlied@redhat.com>
2017-07-15 08:58:55 -07:00
Jason Ekstrand
dc179aa123 anv: Drop support for VK_KHX_external_semaphore_*
These have been formally deprecated by Khronos never to be shipped
again.  The KHR versions should be implemented/used instead.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-15 08:58:51 -07:00
Jason Ekstrand
4ac94d0dee anv: Drop support for VK_KHX_external_memory_*
These have been formally deprecated by Khronos never to be shipped
again.  The KHR versions should be implemented/used instead.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-14 22:12:39 -07:00
Matt Turner
5ffe0c9e1b i965: Compile with -msse2 (instead of -msse3)
Ian noted that there were two Pentium 4 Extreme Edition LGA 775 CPUs, and
they only have SSE2.
2017-07-14 22:01:11 -07:00
Matt Turner
6b05c080f2 i965: Compile with -msse3
All CPUs that can be paired with a GPU supported by i965_dri.so support
SSE3. This allows us to ensure that some vectorized version of the tiled
memcpy path is enabled on 32-bit systems.

This also ensures that __builtin_ia32_clflush is always usable.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101774
Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-14 16:54:43 -07:00
Kenneth Graunke
c7af6d2690 egl: Fix precedence problem when setting __DRI_CTX_FLAG_NO_ERROR
This accidentally set __DRI_CTX_FLAG_NO_ERROR whenever any flags were
present.  Just needs extra parentheses.
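
The exact expression is not quoted in this message, so the following is
a hypothetical reconstruction of this class of bug with made-up flag
values. In C, `|` binds tighter than `?:`, so an unparenthesized
`flags | no_error ? FLAG : 0` is parsed as the first function below:

```c
#include <assert.h>

/* Hypothetical flag values for illustration only. */
#define CTX_FLAG_DEBUG    0x1u
#define CTX_FLAG_NO_ERROR 0x8u

/* How the compiler parses "flags | no_error ? CTX_FLAG_NO_ERROR : 0". */
unsigned flags_buggy(unsigned flags, int no_error)
{
    return (flags | (unsigned)no_error) ? CTX_FLAG_NO_ERROR : 0;
}

/* Intended meaning: the conditional applies to no_error alone. */
unsigned flags_fixed(unsigned flags, int no_error)
{
    return flags | (no_error ? CTX_FLAG_NO_ERROR : 0);
}

/* Usage: a debug-only request shows the difference. */
void precedence_example(void)
{
    assert(flags_buggy(CTX_FLAG_DEBUG, 0) == CTX_FLAG_NO_ERROR);
    assert(flags_fixed(CTX_FLAG_DEBUG, 0) == CTX_FLAG_DEBUG);
}
```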

Fixes: 4909519a66 (egl: Add EGL_KHR_create_context_no_error support)

Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
Tested-by: Mark Janes <mark.a.janes@intel.com>
2017-07-14 15:25:44 -07:00
Petr Sebor
b317cc1b3c drirc: whitelist glthread for Euro Truck Simulator 2
2017-07-14 23:56:37 +02:00
Petr Sebor
483aa3997d drirc: whitelist glthread for American Truck Simulator
2017-07-14 23:56:37 +02:00
Grigori Goronzy
d063168514 mesa/marshal: fix Windows build
This was broken by commit 1ad24faa.

Reported by AppVeyor:
https://ci.appveyor.com/project/mesa3d/mesa/build/4918
2017-07-14 23:46:21 +02:00
Edmondo Tommasina
952f21bc1e drirc: whitelist glthread for The Witcher 2
Performance delta on AMD Phenom II X3 720 / RX 470

    The Witcher 2: +18%

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-07-14 23:41:59 +02:00
Edmondo Tommasina
9a9b9882a5 drirc: whitelist glthread for Civilization 5
Performance delta on AMD Phenom II X3 720

    Civilization 5: +28%

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-07-14 23:41:59 +02:00
Tim Rowley
818209118c swr: JitManager runtime determination of architecture
Fixes performance regression from f50aa21456 - was forcing internal
code generation to target AVX (no gather, etc).

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-14 15:09:22 -05:00
Andres Gomez
25d43cd656 docs: update calendar, add news item and link release notes for 17.1.5
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-07-14 22:30:08 +03:00
Andres Gomez
7ad2f7078d docs: add sha256 checksums for 17.1.5
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-07-14 22:30:08 +03:00
Andres Gomez
ea417c4c64 docs: add release notes for 17.1.5
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-07-14 22:30:08 +03:00
Grigori Goronzy
8d980bf920 st/mesa: Add KHR_no_error toggle to driconf
Allows applications to be whitelisted.

v2: Remove misguided DRI common part.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-14 21:23:44 +02:00
Grigori Goronzy
4909519a66 egl: Add EGL_KHR_create_context_no_error support
This only adds the EGL side, needs to be plumbed into Mesa frontend.

v2: Add check for extension availability.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-14 21:23:44 +02:00
Grigori Goronzy
2bbe235053 st/mesa: Add support for KHR_no_error flag
Add a new context flag and plumb it through the various layers of the
context creation code to set up dispatch tables for the no-error mode.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-14 21:23:40 +02:00
Grigori Goronzy
7299e82fa4 dri: Add KHR_no_error DRI extension
This basic extension allows usage of the __DRI_CTX_FLAG_NO_ERROR flag.
This includes support code for classic Mesa drivers to switch on the
no-error mode if the flag is set.

v2: Move to common DRI code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-14 21:20:31 +02:00
Grigori Goronzy
cfbf60b0c2 mesa/marshal: fix glNamedBufferData with NULL data
The semantics are similar to glBufferData.

Tested-by: Marc Dietrich <marvin24@gmx.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-14 21:20:31 +02:00
Grigori Goronzy
1ad24faa11 mesa/marshal: add marshalling for glClearBuffer*
Add async marshalling/unmarshalling for all glClearBuffer variants.
These entry points are commonly used in general and Alien Isolation
specifically uses glClearBufferiv. Slightly reduces the number of
thread synchronizations with glthread in that game.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-14 21:20:31 +02:00
Grigori Goronzy
8036198c0f mesa/marshal: extract ClearBuffer helpers
Extract clear buffer helper functions in preparation for adding
marshal/unmarshal functions for the various glClearBuffer variants.

v2: Fix command size.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-14 21:20:08 +02:00
Christoph Haag
98514e9959 gallium/hud: use double values for all graphs
The fps graph for example calculates the fps as double with small
variations based on when query_new_value() is called, which causes
many values to be truncated on the cast to uint64_t.

The HUD internally stores the values as double, so just use double
everywhere instead of fixing this with rounding. Using doubles also
allows the hud to show small variations instead of being clamped to
discrete values.
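
A small sketch of the truncation being described. Both functions are
hypothetical and only illustrate the cast; they are not the HUD code:

```c
#include <assert.h>
#include <stdint.h>

/* Frames per second from a frame count and an elapsed time in microseconds. */
double compute_fps(uint64_t frames, uint64_t elapsed_us)
{
    assert(elapsed_us > 0);
    return (double)frames * 1000000.0 / (double)elapsed_us;
}

/* The behaviour being replaced: the cast discards the fraction. */
uint64_t truncate_sample(double value)
{
    return (uint64_t)value;
}
```

Sixty frames measured over 1.001 s and over 0.999 s give roughly 59.94
and 60.06 fps, which truncate to 59 and 60 even though the underlying
rate barely changed.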

v2: Don't print decimals in the dump file when not necessary
Signed-off-by: Christoph Haag <haagch+mesadev@frickel.club>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-07-14 17:34:39 +02:00
Lucas Stach
7e426ef6ec Revert "etnaviv: add support for snorm textures"
This reverts commit d8b2ccdb88, which causes piglit regressions on GPUs
with SNORM support. We'll have another try at enabling this feature after
the 17.2 branchpoint.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-07-14 17:21:50 +02:00
Wladimir J. van der Laan
1d05cec205 etnaviv: reset indexed rendering information when not rendering indexed
A dangling bo object would result in memory corruption while loading a
level in ioquake3_opengl2.

Fixes: 330d0607ed (gallium: remove pipe_index_buffer and set_index_buffer)
Suggested-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-07-14 17:19:42 +02:00
Wladimir J. van der Laan
bb2498a7f6 etnaviv: Use the correct LOG instruction on GC3000
GC3000 has a new LOG instruction, similar to the new SIN and COS instructions.

Generate the new instruction sequence when appropriate; there are
two occasions, as part of LIT and the generator for the LG2
instruction itself.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-07-14 17:15:41 +02:00
Lucas Stach
bccd21ee88 etnaviv: flush source TS before resolve
If we blit from a rendertarget or a depthstencil buffer there might still
be dirty data in the TS buffer which needs to be flushed out.

Fixes missing shadow tiles in glmark2 shadow.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-07-14 17:13:12 +02:00
Philipp Zabel
e9b3381715 etnaviv: flush color cache and depth cache together before resolves
Before resolving a rendertarget or a depth/stencil resource into a
texture, flush both the color cache and the depth cache together.

It is unclear whether this is necessary for the following stall to
work properly, or whether the depth flush just adds enough time
for the color cache flush to finish before the resolver is started,
but this change removes artifacts that otherwise appear if a texture
is sampled directly after rendering into it.

The test case is a simple QML scene graph with a QtWebEngine based
WebView rendered on top of a blue background:

	import QtQuick 2.0
	import QtQuick.Window 2.2
	import QtWebView 1.1

	Window {
		Rectangle {
			id: background
			anchors.fill: parent
			color: "blue"
		}

		WebView {
			id: webView
			anchors.fill: parent
		}

		Component.onCompleted: {
			webView.url = "<some animated website>"
		}
	}

If the website is animated, the WebView renders the site contents into
texture tiles and immediately afterwards samples from them to draw the
tiles into the Qt renderbuffer. Without this patch, a small irregular
triangle in the lower right of each browser tile appears solid blue, as
if the texture sampler samples zeroes instead of the website contents,
and the previously rendered blue Rectangle shows through.

Other attempts such as adding a pipeline stall before the color flush or
a TS cache flush afterwards or flushing multiple times, with stalls
before and after each flush, have shown no effect.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-07-14 17:12:36 +02:00
Lucas Stach
a98c1fbd9b st/mesa: handle stfbi being NULL on entry of st_framebuffer_reuse_or_create
Apparently this can happen. Just bail out early in that case, as all the called
functions return NULL in that case.

Fixes weston-terminal for me.

Fixes: 147d7fb772 ("st/mesa: add a winsys buffers list in st_context")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-14 17:12:00 +02:00
Daniel Stone
5295df63ad egl/wayland: Use MIN2 for wl_drm version
Use a slightly more explicit version cap for binding wl_drm, so we can
add other interfaces with different versioning schemes later.
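
The version cap can be sketched as below. MIN2 is defined locally, and
CLIENT_MAX_VERSION and bind_version are hypothetical names with an
assumed value; the real binding call is not shown:

```c
#include <assert.h>
#include <stdint.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Highest interface version this client implements (assumed value). */
#define CLIENT_MAX_VERSION 2u

/* Bind at the lower of the advertised version and what we implement. */
uint32_t bind_version(uint32_t advertised)
{
    assert(advertised >= 1); /* interface versions start at 1 */
    return MIN2(advertised, CLIENT_MAX_VERSION);
}
```

Each interface can then carry its own maximum, independent of the
others.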

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-14 14:14:05 +01:00
Daniel Stone
4b8ef27e84 egl/wayland: Fix whitespace damage
Convert tabs to spaces, fix misalignments.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-14 14:14:05 +01:00
Daniel Stone
2b895475f6 util: Remove u_math from u_vector
u_vector.h doesn't actually use anything from u_math, but it does mean
everyone has to pull in src/gallium/auxiliary/util includes.

Just remove it, adding a <string.h> include to u_vector.c to cover
memcpy.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-14 14:14:05 +01:00
Eric Engestrom
8821ef4be1 configure: only install khrplatform.h if needed
khrplatform.h is only used by EGL and GLES; let's only install it when
one of those is enabled.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jussi Kukkonen <jussi.kukkonen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-14 13:23:54 +01:00
Eric Engestrom
b50b4b6f84 scons: split out check_header() helper
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-07-14 13:23:54 +01:00
Juan A. Suarez Romero
5cd4ece34e anv/pipeline: do not use BITFIELD64_BIT()
In the previous commit, forgot to apply v2 suggestions.

Fixes: 28d0c38 (anv/pipeline: use unsigned long long constant to check
enable vertex inputs)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-07-14 10:33:19 +00:00
Juan A. Suarez Romero
28d0c38d85 anv/pipeline: use unsigned long long constant to check enable vertex inputs
When initializing the ANV pipeline, one of the tasks is checking which
vertex inputs are enabled. This is done by checking the enabled bits
in inputs_read.

But the mask to use is computed doing `(1 << (VERT_ATTRIB_GENERIC0 +
desc->location))`. The problem here is that if location is 15 or
greater, the sum is 32 or greater. But C is handling 1 as a 32-bit
integer, which means the displaced bit is out of range and thus the full
value is 0.

Thus, use 1ull, which is an unsigned long long value.

This fixes:
dEQP-VK.pipeline.vertex_input.max_attributes.16_attributes.binding_one_to_one.interleaved

v2: use 1ull instead of BITFIELD64_BIT() (Matt Turner)
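
A minimal sketch of the 64-bit mask computation. attrib_bit and
attrib_enabled are hypothetical helpers, not the ANV code. Note that in
C, shifting a 32-bit int by 32 or more is undefined, so the result of
the original expression cannot be relied on at all:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build a mask with only the given bit set. The 1ull literal makes the
 * shift happen in 64 bits, so bits 32..63 are representable.
 */
uint64_t attrib_bit(unsigned bit)
{
    assert(bit < 64);
    return 1ull << bit;
}

/* Test whether a slot is set in a 64-bit inputs_read-style mask. */
int attrib_enabled(uint64_t inputs_read, unsigned bit)
{
    return (inputs_read & attrib_bit(bit)) != 0;
}
```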

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
2017-07-14 08:09:18 +00:00
Kenneth Graunke
b2da123801 i965: Use pushed UBO data in the scalar backend.
This actually takes advantage of the newly pushed UBO data, avoiding
pull loads.

Improves performance in GLBenchmark Manhattan 3.1 by:

   HSW: ~1%, BDW/SKL/KBL GT2: 3-4%, SKL GT4: 7-8%, APL: 4-5%.
   (thanks to Eero Tamminen for these numbers)

shader-db results on Skylake, ignoring programs with spill/fill changes:

   total instructions in shared programs: 13963994 -> 13651893 (-2.24%)
   instructions in affected programs: 4250328 -> 3938227 (-7.34%)
   helped: 28527
   HURT: 0

   total cycles in shared programs: 179808608 -> 172535170 (-4.05%)
   cycles in affected programs: 79720410 -> 72446972 (-9.12%)
   helped: 26951
   HURT: 1248

   LOST:   46
   GAINED: 21

Many "Deus Ex: Mankind Divided" shaders which already spilled end up
spilling a lot more (about 240 programs hurt, 9 helped).  The cycle
estimator suggests this is still overall a win (-0.23% in cycle counts)
presumably because we trade pull loads for fills.

v2: Drop "PULL" environment variable left in for initial debugging
    (caught by Matt).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 20:18:54 -07:00
Kenneth Graunke
c9ef27e77b i965: Factor out push locations.
With UBOs, the answer of "have we decided to push this uniform" gets
a bit more complicated - for one, we have multiple surfaces.  This
patch refactors things so we can add the new code in a single place.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 20:18:54 -07:00
Kenneth Graunke
4f586cd8f1 i965: Push UBO data, but don't use it just yet.
This patch starts uploading UBO data via 3DSTATE_CONSTANT_* packets,
and updates the compiler to know that there's extra payload data, so
things continue working.  However, it still issues pull loads for all
data.  I wanted to separate the two aspects for greater bisectability.

v2: Update for new intel_bufferobj_buffer parameter.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 20:18:30 -07:00
Kenneth Graunke
6834b1ebe3 i965: Pad buffer objects by 2kB in robust contexts to avoid OOB access.
This is an annoyingly big hammer, but it seems less mean than disabling
UBO pushing, and I'm not sure what else to do.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 19:56:49 -07:00
Kenneth Graunke
67fec96452 i965: Stop re-uploading push constants after URB reconfiguration.
Previously we would re-upload the constant data to the batchbuffer,
then re-emit the packets.  We only need to do the last step (causing
the existing data in the batchbuffer to be re-uploaded to the push
constant staging area in the L3).

Now that we've separated the two, it's pretty easy to accomplish.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 19:56:49 -07:00
Kenneth Graunke
29603ae208 i965: Separate uploading push constant data from the pointer packets.
I hope to upload UBOs via 3DSTATE_CONSTANT_XS packets, in addition to
normal uniforms.  In order to do that, I'll need to re-emit the packets
when UBOs change.  But I don't want to re-copy the regular uniform data
to the batchbuffer every time.

This patch separates out the data uploading from the packet submission.
We're running low on dirty bits, so I made the new atom happen on every
draw call, and added a flag to stage_state indicating that we want the
packet for that stage emitted.

I would have preferred to do this outside the atom system, but it has
to happen between the uploading of push constant data and the binding
table upload.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 19:56:49 -07:00
Kenneth Graunke
a18ab92d3c i965: Introduce a BRW_NEW_DRAW_CALL dirty bit.
This allows us to have atoms which are signalled on every draw call.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 19:56:49 -07:00
Kenneth Graunke
24891d7c05 i965: Store per-stage push constant BO pointers.
Right now, we always upload new push constant data, and immediately
emit 3DSTATE_CONSTANT_* packets.  We call intel_upload_space and store
the resulting BO pointer in brw->curbe.curbe_bo.  We read that when
emitting the packets.  This works today, but is fragile - it depends on
upload and packet emission being interleaved.

If we instead were to upload all the data, then emit all the packets,
then upload BO wrapping will get us into trouble.  For example, the VS
constants may land in one upload BO, but the FS constants may not fit
and land in a second upload BO.  Uploading FS constants would overwrite
the brw->curbe.curbe_bo pointer, so when we emitted 3DSTATE_CONSTANT_VS,
we'd get the wrong BO.

I intend to separate out this code in a future commit, so I need to fix
this.  To fix it, we simply store a per-stage BO pointer.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 19:56:49 -07:00
Kenneth Graunke
6d28c6e52c i965: Select ranges of UBO data to be uploaded as push constants.
This adds a NIR pass that decides which portions of UBOs we should
upload as push constants, rather than pull constants.

v2: Switch to uint16_t for the UBO block number, because we may
    have a lot of them in Vulkan (suggested by Jason).  Add more
    comments about bitfield trickery (requested by Matt).

v3: Skip vec4 stages for now...I haven't finished wiring up support
    in the vec4 backend, and so pushing the data but not using it
    will just be wasteful.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 19:56:49 -07:00
Kenneth Graunke
2a5e4f15ef i965: Require a UBO offset alignment of 32 bytes.
Soon, we're going to start providing UBO data to shaders as push
constants, rather than requiring them to issue pull loads.  The
3DSTATE_CONSTANT_* commands require 32 byte aligned pointers.

So, we need to increase this from 16 to 32.
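
As a rough illustration of what a 32-byte offset alignment means for a
caller, here is a minimal sketch.  The macro and helper names are
hypothetical and are not taken from the driver; only the value 32 comes
from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Alignment named above for 3DSTATE_CONSTANT_* pointers.
 * The macro name is hypothetical. */
#define UBO_OFFSET_ALIGNMENT 32u

/* True if `offset` is a multiple of the (power-of-two) alignment. */
static bool ubo_offset_is_aligned(uint64_t offset)
{
    return (offset & (UBO_OFFSET_ALIGNMENT - 1)) == 0;
}

/* Round `offset` up to the next multiple of the alignment. */
static uint64_t ubo_offset_align_up(uint64_t offset)
{
    const uint64_t mask = (uint64_t)UBO_OFFSET_ALIGNMENT - 1;
    return (offset + mask) & ~mask;
}
```

An offset of 16 was acceptable under the old limit but is rejected by
ubo_offset_is_aligned(); ubo_offset_align_up(16) yields 32.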

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 19:56:49 -07:00
Kenneth Graunke
8ec5a4e4a4 i965: Switch to absolute addressing for constant buffer 0.
By default, 3DSTATE_CONSTANT_* Constant Buffer 0 is relative to dynamic
state base address.  This makes it unusable for pushing UBOs.  I'd like
to be able to use all four push buffers.

There is a bit in the INSTPM register (or CS_DEBUG_MODE2 on Skylake)
which controls whether buffer 0 is relative to dynamic state base
address, or simply a normal pointer.  Setting that gives us full
flexibility.

We can't currently write this on Haswell and earlier, and will need
to update the kernel command parser, and then do the whole version
checking song and dance.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-13 19:56:49 -07:00
Kenneth Graunke
86bd3fd864 i965: Use async maps for BufferSubData to regions with no valid data.
When writing a region of a buffer via glBufferSubData(), we can write
the data asynchronously if the destination doesn't contain any data.
Even if it's busy, the data was undefined, so the new data is fine too.
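
The check can be sketched roughly as follows, assuming the buffer keeps
a single [start, end) range of bytes known to hold valid data.  The
struct and function names are hypothetical and do not mirror the actual
i965 fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping: bytes in [valid_start, valid_end) hold
 * defined data.  An empty range is encoded as valid_start >= valid_end. */
struct buffer_valid_range {
    uint64_t valid_start;
    uint64_t valid_end;
};

/* A write to [offset, offset + size) can skip synchronization when it
 * does not overlap any valid data. */
static bool write_can_be_async(const struct buffer_valid_range *r,
                               uint64_t offset, uint64_t size)
{
    if (r->valid_start >= r->valid_end)
        return true; /* nothing valid yet */
    return offset + size <= r->valid_start || offset >= r->valid_end;
}

/* After a write, grow the valid range so it covers the new bytes. */
static void mark_range_valid(struct buffer_valid_range *r,
                             uint64_t offset, uint64_t size)
{
    if (size == 0)
        return;
    if (r->valid_start >= r->valid_end) {
        r->valid_start = offset;
        r->valid_end = offset + size;
        return;
    }
    if (offset < r->valid_start)
        r->valid_start = offset;
    if (offset + size > r->valid_end)
        r->valid_end = offset + size;
}
```

Tracking one range is conservative: two disjoint writes mark everything
between them as valid, which can only cause extra synchronization, never
a missed one.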

Removes all stall avoidance blits on BufferSubData calls in
"Total War: WARHAMMER" on my Skylake GT4.

Decreases the number of stall avoidance blits in Manhattan 3.1:
- Skylake GT4: -18.3544% +/- 6.76483% (n=13)
- Apollolake:  -12.1095% +/- 5.24458% (n=13)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-13 16:58:17 -07:00
Kenneth Graunke
5f223648f2 i965: Track a range of the buffer which contains valid data.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-13 16:58:17 -07:00
Kenneth Graunke
f47612dafb i965: Add a "write" parameter to intel_bufferobj_buffer.
This doesn't do anything yet, but soon we'll want to know whether an
access to a buffer section may write that data, or simply reads it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-13 16:58:17 -07:00
Rafael Antognolli
9a9c7e452b i965: Convert GS_STATE to genxml.
Merge the code with gen6+ 3DSTATE_GS, and delete brw_gs_state.c,
together with brw_gs_unit_state.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-13 15:39:49 -07:00
Rafael Antognolli
2936388205 i965: Prepare gs_state emitting code to include gen4-5.
Since we always call brw_batch_emit anyways, we can hopefully make things
simpler by calling it only once, and then branching inside its body. This
can be helpful when bringing the gen4-5 code into this function.

Additionally, check for GEN_GEN == 6 instead of < 7 in cases that won't apply
to lower gens.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-13 15:39:49 -07:00
Rafael Antognolli
9a2cca929f i965: Remove upload_gs_state_for_tf.
This function only emits a particular case of 3DSTATE_GS. Instead, we can do
that inside genX(upload_gs_state), and later reuse part of that code for
emitting gen4-5 state.

There's the additional benefit of allowing us to remove gen6_gs_state.c, which
was only left because of this function.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-13 15:39:49 -07:00
Rafael Antognolli
ad7663b838 i965: Convert BLEND_CONSTANT_COLOR state to genxml.
It's a very simple conversion, and it allows us to delete brw_cc.c.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-13 15:39:49 -07:00
Rafael Antognolli
5d48710981 i965: Convert CC state on gen4-5 to genxml.
Use set_blend_entry_bits and set_depth_stencil_bits to fill most of the
color calc struct, and then manually update the rest.

v2:
   - Always check for depth_irb (Ken)
   - Always set Backface Stencil Ref (Ken)
   - Always set alpha reference value (Ken)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-13 15:39:49 -07:00
Rafael Antognolli
b0052aa46f i965: Move color calc code around a bit.
This makes the code more consistent across generations.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-13 15:39:49 -07:00
Rafael Antognolli
5246d3eb83 i965: Check for alpha channel just like in gen6+.
gen6+ uses _mesa_base_format_has_channel() to check for the alpha
channel, while gen4-5 use ctx->DrawBuffer->Visual.alphaBits. By using
_mesa_base_format_has_channel() here we keep the same behavior across
all gens.

While initially both ways of checking the alpha channel seemed correct
to me, this change also seems to fix fbo-blending-formats piglit test on
gen4.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-13 15:39:49 -07:00
Rafael Antognolli
1d2d3dbc8a i965: Make a helper function for blend entry related state.
Add a helper function to reuse code that fills blend entry related
state, and make genX(upload_blend_state) use it. This function can later
be used by gen4-5 color calc state to set the blend related bits.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-13 15:39:49 -07:00
Kenneth Graunke
e84cb56f48 i965: Make a helper function for depth/stencil related state.
Gen4-5 basically glue DEPTH_STENCIL_STATE, COLOR_CALC_STATE, and
BLEND_STATE together into a single COLOR_CALC_STATE structure.

By making a helper function, we'll be able to reuse it when filling
out Gen4-5 COLOR_CALC_STATE without replicating any actual logic.

We use a generation-defined typedef to handle the polymorphism.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-07-13 15:39:49 -07:00
Lionel Landwerlin
6131a1ae40 aubinator: don't leak fd of opened aubfile
CID: 1373563
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:50 +01:00
Lionel Landwerlin
d1bd731e30 anv: don't use strcpy for copying strings
CID: 1358935
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:47 +01:00
Lionel Landwerlin
226fae7849 intel/compiler: no need to check unsigned is >= 0
CID: 1338342
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:45 +01:00
Lionel Landwerlin
7c4daf8c37 i965: fix missing NULL return if allocation fails
CID: 1250585
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:41 +01:00
Lionel Landwerlin
95c917668c intel/compiler: don't check unsigned is >= 0
CID: 1224468
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:38 +01:00
Lionel Landwerlin
fd8e8fdbfe i965: check pointer before dereferencing it
Check that irb isn't NULL before accessing irb->Base.Base.NumSamples.

CID: 1026046
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:35 +01:00
Lionel Landwerlin
b02d136b5e i965: map_gtt: check mapping address before adding offset
The NULL check might fail if offset isn't 0.
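
A minimal sketch of the ordering issue, assuming the mapping call
signals failure by returning NULL.  The helper name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: `base` is the result of a mapping call that
 * returns NULL on failure. */
static void *map_with_offset(void *base, size_t offset)
{
    /* Check first.  If the offset were added before the test, a failed
     * mapping with a nonzero offset would no longer compare equal to
     * NULL and the error would go unnoticed. */
    if (base == NULL)
        return NULL;
    return (uint8_t *)base + offset;
}
```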

CID: 971379
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:32 +01:00
Lionel Landwerlin
a25a533458 intel/compiler: remove check unsigned is >= 0
By definition, unsigned values are always >= 0.

CID: 742212
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:29 +01:00
Lionel Landwerlin
19869d6091 isl: use 64bit arithmetic to compute size
If we allow the size to be more than 2^32, then we should compute it
in 64-bit arithmetic; otherwise we might run into overflow issues.
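
The general pattern, sketched with hypothetical names (a row pitch and a
row count, both 32-bit), is to widen one operand before multiplying:

```c
#include <stdint.h>

/* Hypothetical inputs: a row pitch in bytes and a row count. */
static uint64_t surface_size_bytes(uint32_t row_pitch, uint32_t rows)
{
    /* Widening one operand makes the product a 64-bit computation. */
    return (uint64_t)row_pitch * rows;
}

/* The overflowing form, kept only for comparison. */
static uint64_t surface_size_bytes_32(uint32_t row_pitch, uint32_t rows)
{
    /* The multiply happens in 32 bits and wraps modulo 2^32 before the
     * result is widened. */
    uint32_t wrapped = row_pitch * rows;
    return wrapped;
}
```

With a pitch of 65536 and 65536 rows, the first helper returns 2^32 while
the second wraps to 0.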

CID: 1412892, 1412891
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-07-13 22:50:26 +01:00
Connor Abbott
4df93a54f1 nir/lower_io_to_temporaries: don't set compact on shadow vars
The compact flag doesn't make sense on local variables, since the
packing on them is up to the driver. This fixes nir_validate assertions
in some cases, particularly when lower_io_to_temporaries is used on
per-vertex inputs/outputs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-13 14:45:25 -07:00
Connor Abbott
99ff7a9f1f nir: don't segfault when printing variables with no name
While normally we give variables whose name field is NULL a temporary
name when called from nir_print_shader(), when we were calling from
nir_print_instr() we never bothered, meaning that we just segfaulted
when trying to print out instructions with such a variable. Since
nir_print_instr() is meant to be called while debugging, we don't need
to bother too much about giving a consistent name, but we don't want to
crash in the middle of debugging.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-07-13 14:40:23 -07:00
Jason Ekstrand
add72599d9 i965/urb: Trigger upload_urb on NEW_BLORP
It's a bit rare, but blorp can trigger a urb reconfiguration.  When
that happens, we need to re-upload the URB config.  Previously blorp
would set BRW_NEW_URB_SIZE, but this is a pretty big hammer as it
would cause back-to-back blorp operations to reconfigure both times.
Using BRW_NEW_BLORP is a smaller, more accurate hammer.

v2 (idr): Sort BRW_NEW_ tokens to match brw_recalculate_urb_fence and
gen6_urb.

v3 (idr): Don't whack BRW_NEW_URB_SIZE in blorp.  Suggested by Jason.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-13 13:41:09 -07:00
Kenneth Graunke
42c64b5f87 mesa: Return GL_INVALID_ENUM for bogus TEXTURE_SRGB_DECODE_EXT params.
Fixes dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.srgb_decode_samplerparameter{f,fv,i,Iiv,Iuiv,iv}.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-07-13 13:00:58 -07:00
Marek Olšák
f33d8af7aa st/dri: add 32-bit RGBX/RGBA formats
Add support for 32-bit RGBX/RGBA formats which are required for Android.

The original patch (commit ccdcf91104) was reverted (commit
c0c6ca40a2) in mesa as it broke GLX resulting in swapped colors. Based
on further investigation by Chad Versace, moving the RGBX/RGBA configs
to the end is enough to prevent breaking GLX.

The handling of RGBA/RGBX in dri_fill_st_visual is a fix from Marek
Olšák.

Cc: Eric Anholt <eric@anholt.net>
Cc: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-07-13 14:36:47 -05:00
Eric Anholt
5a9fb2eabc broadcom/vc4: Add more packets to the v2.1 XML.
These will be used to replace vc4_cl_dump.c's hand-written dumping.
2017-07-13 11:30:42 -07:00
Eric Anholt
427bbbb99c broadcom: Introduce a header for talking about chip revisions.
This will be used by the VC5 driver and various shared VC4/VC5 tooling,
like the XML decoder.
2017-07-13 11:28:28 -07:00
Eric Anholt
fd37ce6bec broadcom/genxml: Use the same "gen" attr for HW version as Intel does.
This will let us reuse their tools more easily.
2017-07-13 11:28:28 -07:00
Eric Anholt
ee170c9d83 broadcom/genxml: Support unpacking fixed-point fractional values.
This was an oversight in the original XML support, because unpacking
wasn't used much.  The new XML-based CL dumper will want it, though.
2017-07-13 11:28:28 -07:00
Michel Dänzer
655a32f729 st/mesa: Handle st_framebuffer_create returning NULL
st_framebuffer_create returns NULL if stfbi == NULL or
st_framebuffer_add_renderbuffer returns false for the colour buffer.

Fixes Xorg crashing on startup using glamor on radeonsi.

Fixes: 147d7fb772 ("st/mesa: add a winsys buffers list in st_context")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101775
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-07-13 09:26:20 -06:00
Tim Rowley
254fa3dbf5 swr/rast: Fix use of KNL-only intrinsics in SKX build
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-13 08:47:10 -05:00
Tim Rowley
4c185dd3b3 swr/rast: Fix build warnings when using the Intel compiler
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-13 08:47:10 -05:00
Tim Rowley
bbc3b5c0dc swr/rast: SIMD16 Frontend - Fix USE_SIMD16_FRONTEND build
Previous check-ins without testing with USE_SIMD16_FRONTEND have
introduced regressions. This fixes the build, not the regressions.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-13 08:47:10 -05:00
Tim Rowley
640ea4d9a1 swr/rast: Removing unneeded MSVC warning pragma
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-13 08:47:10 -05:00
Tim Rowley
185b37f641 swr/rast: Add support for read-only render targets
Core will ensure hot tiles are loaded for read and write render targets,
and will skip all output merger for read-only render targets.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-13 08:47:10 -05:00
Tim Rowley
d8ebcad540 swr/rast: Support render target mask instead of render target count
WIP to support read-only render targets.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-13 08:47:10 -05:00
Alejandro Piñeiro
57671025b0 egl: remove unused err variable
Fixes: 81e95924ea ("egl: call _eglError within _eglParseImageAttribList")

Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-07-13 13:36:37 +02:00
Nicolai Hähnle
c22e3c5373 radeonsi/gfx9: fix crash building monolithic merged ES-GS shader
Forwarding from the ES prolog to the ES just barely exceeds the current
maximum array size when 16 vertex attributes are used. Give it a decent
bump to account for merged shaders having up to 32 user SGPRs.

Fixes a crash in GL45-CTS.multi_bind.draw_bind_vertex_buffers.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-13 13:01:15 +02:00
Thomas Hellstrom
81fb154777 loader/dri3: Use dri3_find_back in loader_dri3_swap_buffers_msc
If the application hasn't done any drawing since the last call, we
would reuse the same back buffer which was used for the previous swap,
which may not have completed yet. This could result in various issues
such as tearing or application hangs.

In the normal case, the behaviour is unchanged.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97957
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101683
Cc: mesa-stable@lists.freedesktop.org

[Michel Dänzer: Make Thomas' fix from bugzilla actually work as
 intended, write commit log]
2017-07-13 16:49:28 +09:00
Jason Ekstrand
c3b5c2ca19 i965/screen: Drop get_tiled_height
It's no longer used.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Jason Ekstrand
b29c364f0d i965/screen: Use ISL for doing image import checks
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Jason Ekstrand
76f683e91d i965/screen: Use ISL for allocating image BOs
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Jason Ekstrand
5b3363e3f1 intel/isl: Add a helper to convert tilings from ISL to i915
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Jason Ekstrand
a668ba9c18 intel/isl: Add basic modifier introspection
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Jason Ekstrand
285242e674 i965: Add an isl_device to intel_screen
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Jason Ekstrand
06c95f1282 i965/miptree: Move CCS allocation into create_for_dri_image
Any form of CCS on gen9+ only works on Y-tiled images.  The only caller
of create_for_bo which uses Y-tiled BOs is create_for_dri_image.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Jason Ekstrand
b3a44ae7a4 i965: Use create_for_dri_image in intel_update_image_buffer
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Jason Ekstrand
90d93755d1 i965/miptree: Add support for window system images to create_for_dri_image
We want to start using create_for_dri_image for all miptrees created
from __DRIimage, including those which come from a window system.  In
order to allow for fast clears to still work on window system buffers,
we need to allow for creating aux surfaces.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Jason Ekstrand
2dd4e2348f i965/miptree: Add a colorspace parameter to create_for_dri_image
The __DRI_FORMAT enums are all UNORM but we will frequently want sRGB
when creating miptrees for renderbuffers.  This lets us specify.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Jason Ekstrand
14ce44a7bc main/formats: Add a get_linear_format_srgb helper
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Jason Ekstrand
361eb1c6e7 main/formats: Autogenerate _mesa_get_srgb_format_linear
Due to the wonders of autogeneration, this new version covers a few
formats that the old version was missing:

    MESA_FORMAT_SRGB8_ALPHA8_ASTC_3x3x3
    MESA_FORMAT_SRGB8_ALPHA8_ASTC_4x3x3
    MESA_FORMAT_SRGB8_ALPHA8_ASTC_4x4x3
    MESA_FORMAT_SRGB8_ALPHA8_ASTC_4x4x4
    MESA_FORMAT_SRGB8_ALPHA8_ASTC_5x4x4
    MESA_FORMAT_SRGB8_ALPHA8_ASTC_5x5x4
    MESA_FORMAT_SRGB8_ALPHA8_ASTC_5x5x5
    MESA_FORMAT_SRGB8_ALPHA8_ASTC_6x5x5
    MESA_FORMAT_SRGB8_ALPHA8_ASTC_6x6x5
    MESA_FORMAT_SRGB8_ALPHA8_ASTC_6x6x6

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Ben Widawsky
34e1ccbfbe i965/miptree: Allocate mt earlier in update winsys
Later commits require intel_update_image_buffer() to have control over
the miptree creation.   However, intel_update_winsys_renderbuffer_miptree()
currently  creates it based on the given buffer object. This patch moves
the creation to the caller side.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Ben Widawsky
aadd37298c i965/miptree: Add a return for updating of winsys
There is nothing particularly useful to do currently if the update
fails, but there is no point carrying on either. As a result, this has a
behavior change.

v2: Make the return type a bool (Topi)

v3: Don't leak the bo if update_winsys_renderbuffer fails. (Jason)

Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Jason Ekstrand
30cfed57ce i965: Use miptree_create_for_dri_image in image_target_renderbuffer_storage
This does make a tiny functional change in that we now also test for
whether or not the format supports texturing and not just rendering.
However, this should have no practical effect as all renderbuffers use
texturable formats.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Jason Ekstrand
091965760d i965/miptree: Set level_x/h in create_for_dri_image
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Jason Ekstrand
4bf140576a i965/miptree: Add tile_x/y to total_width/height
This is what we do in intel_image_target_renderbuffer_storage and it
makes more sense than stomping them.  Because the image gets created as
a 2D image with one miplevel, they should already be equal to the
provided width/height.  Adding the tile offset makes some sense
depending on how you interpret the fields.

The only place these fields are used in state setup is to set up the
image parameters we pass into shaders.  There may be issues here if you
try to use image_load_store on something pulled in from EGL but that's
probably broken already.  This just makes it consistently broken.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-12 21:15:46 -07:00
Jason Ekstrand
947b72ab5d i965/miptree: Pass the offset into create_for_bo in create_for_dri_image
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-12 21:15:46 -07:00
Jason Ekstrand
72e7a6b0b6 i965: Move the DRIimage -> miptree code to intel_mipmap_tree.c
This is mostly a direct port.  The only bit of refactoring that was done
was to make creating a planar miptree be an early return from the
non-planar case.  Alternatively, we could have three functions: two
helpers and a main function to just call the right helper.  Making the
planar case an early return seemed cleaner.

Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-07-12 21:15:46 -07:00
Ilia Mirkin
3645268748 nv50/ir: fix threads calculation for non-compute shaders
We were using the "cp" union fields, which are only valid for compute
shaders. The threads calculation affects the available GPRs, so just
pick a small number for other shader types to avoid limiting available
registers.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-07-12 22:09:59 -04:00
Ilia Mirkin
87028f8639 freedreno/ir3: fix load_front_face conversion
The comments are correct - we get -1 and 0. However by adding 1, we
convert this into 0,1. This mostly works for conditionals, but when
negated, this will yield the wrong result. Instead just negate the
values (as they are backwards -- -1 means back instead of front).
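
The failure can be modelled in plain C.  This sketch assumes booleans are
represented as 0 (false) and ~0 (true), and reads "negate" above as a
bitwise NOT; both are assumptions, and the function names are
hypothetical.

```c
#include <stdint.h>

/* Assumed raw hardware value: -1 for back-facing, 0 for front-facing.
 * Assumed boolean convention: 0 is false, ~0 (all bits set) is true. */

/* Old conversion: add 1, giving 0 (back) or 1 (front). */
static int32_t front_face_add_one(int32_t raw)
{
    return raw + 1;
}

/* New conversion: bitwise NOT, giving 0 (back) or ~0 (front). */
static int32_t front_face_not(int32_t raw)
{
    return ~raw;
}

/* "!gl_FrontFacing" modelled as a bitwise NOT followed by a test
 * against zero. */
static int is_not_front(int32_t front_face)
{
    return ~front_face != 0;
}
```

For a front-facing fragment (raw 0), the old conversion gives 1, and ~1
is still nonzero, so the negated condition is wrongly taken.  The new
conversion gives ~0, whose NOT is 0.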

Fixes tests/shaders/glsl-fs-frontfacing-not.shader_test and
dEQP-GLES3.functional.shaders.builtin_variable.frontfacing on A530.

The latter also tested on A306 by Rob Clark.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-07-12 19:30:46 -04:00
Alex Smith
0e1886efb9 radv: Fix descriptors for cube images with VK_IMAGE_USAGE_STORAGE_BIT
If a cube image has VK_IMAGE_USAGE_STORAGE_BIT set, the type in an image
view's descriptor was set to a 2D array (and a few other fields adjusted
accordingly). This is correct when the image view is actually bound as a
storage image, but not when bound as a sampled image. In that case the
type should be set as a cube.

Fix by generating 2 sets of descriptors at view creation time for both
storage and non-storage usage, and then choose between them based on
descriptor type when writing descriptor sets.

v2: Generate storage descriptors for images with TRANSFER_DST, since
    those may be used as storage images internally.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-07-13 00:21:20 +02:00
Alex Smith
4d5c0c189d radv: Fix possible invalid free of dynamic descriptors
This free was left in after dynamic descriptors were changed to not be
allocated separately from the descriptor set, and can cause a crash.

Fixes: 39644fa40a ("radv: Don't allocate dynamic descriptors separately")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-07-13 00:21:20 +02:00
Bruce Cherniak
02735e6cf8 swr: Add path to draw directly from client memory without copy.
If the size of the client memory copy is too large, don't copy. The draw
will access the user buffer directly and then block.  This is faster and
more efficient than queuing many large client draws.

Applications that still use large client arrays benefit from this.  VMD
is an example.

The threshold for this path defaults to 32KB.  This value can be
overridden by setting environment variable SWR_CLIENT_COPY_LIMIT.
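
A minimal sketch of how such an override could be read, not the actual
swr code.  It assumes the variable holds a decimal byte count and that a
draw exactly at the limit is still copied; the macro and function names
are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Default named above: 32KB.  The macro name is hypothetical. */
#define CLIENT_COPY_LIMIT_DEFAULT (32ul * 1024ul)

/* Parse a decimal byte count; fall back to the default on NULL, empty,
 * or malformed input. */
static unsigned long parse_client_copy_limit(const char *s)
{
    char *end = NULL;
    unsigned long v;

    if (s == NULL || *s == '\0')
        return CLIENT_COPY_LIMIT_DEFAULT;
    errno = 0;
    v = strtoul(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0')
        return CLIENT_COPY_LIMIT_DEFAULT;
    return v;
}

/* Read the override from the environment. */
static unsigned long client_copy_limit(void)
{
    return parse_client_copy_limit(getenv("SWR_CLIENT_COPY_LIMIT"));
}

/* Assumption: sizes strictly above the limit take the direct path. */
static bool draw_directly_from_client(size_t bytes, unsigned long limit)
{
    return bytes > limit;
}
```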

v2: Use #define for default value, rather than hard-coded constant.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-07-12 16:56:40 -05:00
Bruce Cherniak
1520a06607 swr: Move environment config options into separate function.
Moved reading of environment config options out of
swr_create_screen_internal, into a separate swr_validate_env_options.
This is to keep from cluttering create_screen.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-07-12 16:56:40 -05:00
Bruce Cherniak
5bd9554f3d swr: Remove hard-coded constant and "todo" comment.
Removed the hard-coded constant in favor of a #define.  Also removed
TODO comment.  The constant value doesn't need an environment
configurable option.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-07-12 16:56:40 -05:00
Rob Herring
7a7a84c8db Android: Fix vc4 build since XML changes.
Since commit 7f80a9ff13 ("vc4: Introduce XML-based packet header
generation like Intel's."), the vc4 build on Android is broken:

out/target/product/linaro_x86_64/gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h:12:10: fatal error: 'v3d_packet_helpers.h' file not found
external/mesa3d/src/gallium/drivers/vc4/vc4_cl_dump.c:28:10: fatal error: 'vc4_packet.h' file not found

The path of the generated header needs to be fixed since we build out of
tree.

Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-07-12 16:47:10 -05:00
Charmaine Lee
147d7fb772 st/mesa: add a winsys buffers list in st_context
Commit a5e733c6b5 fixes the dangling
framebuffer object by unreferencing the window system draw/read buffers
when the context is released. However, this can prematurely destroy the
resources associated with these window system buffers. The problem is
reproducible with Turbine Demo running with the VMware driver. In this
case, the depth buffer content is lost when the context is rebound to a
drawable.

To prevent premature destruction of the resources associated with
window system buffers, this patch maintains a list of these buffers in
the context, making sure the reference counts of these buffers will not
reach zero until the associated framebuffer interface objects no
longer exist. This also helps to avoid unnecessary destruction and
re-construction of the resources associated with the framebuffer.

Fixes VMware bug 1909807.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-07-11 19:40:17 -07:00
Kenneth Graunke
76acbd07fc i965: Drop bogus pthread_mutex_unlock in map_gtt error path.
The locking was supposed to go away in commit 314647c4c2
(i965: Drop global bufmgr lock from brw_bo_map_* functions.), but
this lone unlock remains.

I'm guessing I messed this up when splitting up Chris's patch.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-07-12 12:39:10 -07:00
Anuj Phogat
0a56c5f3f1 intel/compiler: Don't use opt_sampler_eot() optimization on gen10+
This optimization has been removed on gen10+.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-12 11:27:31 -07:00
Eric Anholt
84ed8b67c5 vc4: Set shareable BOs as T tiled if possible
X11 and GL compositor performance on VC4 has been terrible because of our
SHARED-usage buffers all being forced to linear.  This swaps SHARED &&
!LINEAR buffers over to being tiled.

This is an expected win for all GL compositors during rendering (a full
copy of each shared texture per draw call), allows X11 to be used with
decent performance without a GL compositor, and improves X11 windowed
swapbuffers performance as well.  It also halves the memory usage of
shared buffers that get textured from.  The only cost should be idle
systems with a scanout-only buffer that isn't flagged as LINEAR, in which
case the memory bandwidth cost of scanout goes up ~25%.

This implements the EGL_EXT_image_dma_buf_import_modifiers extension,
supporting the VC4 T_TILED modifier.

v2: Added modifier support to resource creation/import, and
    advertisement (by daniels).
v3: Fix old-kernel fallback path, fix compiler error and warnings, and
    comment touchups (by anholt).

Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-07-12 10:58:33 -07:00
Eric Anholt
bb466a996f vc4: Use vc4_setup_slices for resource import
Rather than open-coding populating the first slice inside resource
import, use vc4_setup_slices to do it for us.

v2: Rebase on VC4_DEBUG=surf change

Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-07-12 10:58:33 -07:00
Eric Anholt
111b6b77cb vc4: Make the miptree debug code available under VC4_DEBUG=surf
I kept flipping the bool on for debug, so let's just make it available.

Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-07-12 10:58:33 -07:00
Eric Anholt
a2d87a0019 vc4: Switch back to using a local copy of vc4_drm.h.
Needing to get our uapi header from libdrm has only complicated things.
Follow intel's lead and drop our requirement for it.

Generated from the same commit mentioned in the README.

v2: Update Android.mk as well, move vc4_drm.h reference for distcheck.

Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-07-12 10:58:33 -07:00
Eric Anholt
5d6271c6a5 intel: Move the DRM uapi headers to a non-Intel location.
I want to remove vc4's dependency on headers from libdrm as well, but
storing multiple copies of drm_fourcc.h in our tree would be silly.

v2: Update Android.mk as well, move distcheck drm*.h references to
    top-level noinst_HEADERS.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v1)
Reviewed-by: Daniel Stone <daniels@collabora.com> (v1)
Reviewed-by: Rob Herring <robh@kernel.org>
2017-07-12 10:58:33 -07:00
Eric Anholt
2aec62a45b vc4: Remove a stale comment.
The kernel hasn't been synchronous in a couple of years, plus there was
synchronization code right there.
2017-07-12 10:58:33 -07:00
Jason Ekstrand
8e3d9c5d09 anv: Round u_vector element sizes to a power of two
This fixes 32-bit builds of the driver.  Commit 08413a81b9
changed things so that we now put struct anv_states in the u_vector for
binding tables.  On 64-bit builds, sizeof(struct anv_state) is a power
of two but it isn't on 32-bit builds.
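
Rounding an element size up to a power of two can be sketched as below.
The helper name is hypothetical and this is not the driver's own code.

```c
#include <stdint.h>

/* Round `v` up to the next power of two.  Values that are already a
 * power of two are returned unchanged; 0 and 1 both yield 1.  Inputs
 * above 2^31 are out of range for this sketch (the result would wrap
 * to 0). */
static uint32_t round_up_pow2(uint32_t v)
{
    if (v <= 1)
        return 1;
    v--;
    /* Smear the highest set bit into every lower bit position. */
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    return v + 1;
}
```

A 12-byte element would be padded to 16 bytes, while a 16-byte element
is left alone.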

Fixes: 08413a81b9
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2017-07-12 10:34:13 -07:00
Brian Paul
5e5f251db1 svga: whitespace, formatting fixes in svga_swtnl_backend.c 2017-07-12 10:58:14 -06:00
Brian Paul
f2b59f6c02 svga: whitespace, formatting fixes in svga_swtnl_draw.c 2017-07-12 10:58:14 -06:00
Brian Paul
183d4193b8 svga: whitespace, formatting fixes in svga_swtnl_state.c
2017-07-12 10:58:13 -06:00
Brian Paul
f62bc96dd6 svga: move comment, declaration in svga_init_shader_key_common()
Put the comment before the relevant code.  Move the declaration of the
swizzle_tab var to where it's used.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-12 10:58:08 -06:00
Brian Paul
33eedd081e draw: whitespace, formatting fixes in draw_vs_exec.c
Trivial.
2017-07-12 10:58:07 -06:00
Brian Paul
8871c3ccf6 draw: s/unsigned/enum tgsi_semantic/
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-12 10:58:02 -06:00
Emil Velikov
459274144d travis: lower SWR requirement to GCC 4.8, aka std=c++11
With an earlier commit we relaxed the requirement from C++14 to C++11.
Update the build script so that it

Cc: Tim Rowley <timothy.o.rowley@intel.com>
Fixes: 0b80b02502 ("swr: relax c++ requirement from c++14 to c++11")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-12 15:46:25 +01:00
Emil Velikov
432f8bff5a docs: update HTTP -> HTTPS reference to reflect reality
The link recently got updated to https.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-07-12 15:45:30 +01:00
Emil Velikov
4506a74cc6 egl: set KHR_gl_texture_3D_image only when the requirements are met.
DRI_IMAGE's createImageFromTexture is used to implement the extension,
so we should check for it prior to advertising.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-07-12 15:45:27 +01:00
Emil Velikov
962110fa57 egl: enhance KHR_gl_image extensions checks
Drop the (duplicate) top-level check in dri2_create_image_khr() and add
the respective checks in dri2_create_image_khr_{texture,renderbuffer}

v2: use unreachable instead of assert in dri2_create_image_khr_texture

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-07-12 15:44:26 +01:00
Emil Velikov
a2ae8e6076 egl: don't set modifier if no modifiers are available
If no modifiers are available, the variable will never be used. Thus
there's no point in initialising it.

Cc: Varad Gautam <varad.gautam@collabora.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-07-12 15:43:15 +01:00
Emil Velikov
4d8191fd00 egl: check for extensions' presence during attr parsing
If the respective extension is not supported, one should return
EGL_BAD_PARAMETER as mentioned in earlier commits.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-07-12 15:43:12 +01:00
Emil Velikov
cd859452e9 egl: add width/height as EXT_image_dma_buf_import attrs
Although not listed amongst the initial EGL_LINUX_DRM_FOURCC_EXT and
friends list, the spec reads

   ... Required attributes and their values are as
   follows:

    * EGL_WIDTH & EGL_HEIGHT: The logical dimensions of the buffer in pixels

    * EGL_LINUX_DRM_FOURCC_EXT: The pixel format of the buffer, as specified
      by drm_fourcc.h and used as the pixel_format parameter of the
      drm_mode_fb_cmd2 ioctl.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-07-12 15:43:09 +01:00
Emil Velikov
d13dcca2c2 egl: polish EXT_image_dma_buf_import attr parsing
Simplify the existing if/else + temporary variable into if (foo) return
X.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-07-12 15:43:05 +01:00
Emil Velikov
448f70e366 egl: simplify EXT_image_dma_buf_import_modifiers attr parsing
Move the common extension check at the top.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-07-12 15:42:59 +01:00
Emil Velikov
3ee2be4113 egl: split _eglParseImageAttribList into per extension functions
Will allow us to simplify existing code and make further improvements
short and simple.

No functional change intended.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-07-12 15:42:54 +01:00
Emil Velikov
81e95924ea egl: call _eglError within _eglParseImageAttribList
As per EGL_KHR_image_base:

   If an attribute specified in <attrib_list> is not one of the
   attributes listed in Table bbb, the error EGL_BAD_PARAMETER is
   generated.

We should set the error as opposed to simply log it.

Currently we have a partial solution, whereby only some of the callers
call _eglError().

Since that has proven to be less robust, simply set the error by the
function itself and change the return type to EGLBoolean, updating the
callers.

So now the code is slightly simpler. Plus the follow-up fixes will be
easier to manage.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-07-12 15:42:51 +01:00
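The change in 81e95924ea above (the parser sets the error itself and returns a boolean) can be sketched roughly as follows. This is a toy model, not the Mesa implementation: `set_error`, `parse_image_attrib_list`, the `KNOWN_ATTRIBS` set and the use of strings for EGL tokens are all assumptions made for illustration.

```python
last_error = "EGL_SUCCESS"

# Hypothetical stand-in for the attributes the parser recognises.
KNOWN_ATTRIBS = {"EGL_WIDTH", "EGL_HEIGHT"}


def set_error(code: str) -> None:
    """Record the error, playing the role of _eglError() in this sketch."""
    global last_error
    last_error = code


def parse_image_attrib_list(attrib_list, attrs) -> bool:
    """Parse name/value pairs terminated by "EGL_NONE".

    On an unknown attribute the function sets EGL_BAD_PARAMETER itself
    and returns False, so callers only need to check the return value.
    """
    i = 0
    while i < len(attrib_list) and attrib_list[i] != "EGL_NONE":
        if i + 1 >= len(attrib_list) or attrib_list[i] not in KNOWN_ATTRIBS:
            set_error("EGL_BAD_PARAMETER")
            return False
        attrs[attrib_list[i]] = attrib_list[i + 1]
        i += 2
    return True


attrs = {}
ok = parse_image_attrib_list(["EGL_WIDTH", 64, "EGL_BOGUS", 1, "EGL_NONE"], attrs)
print(ok, last_error)
```

Centralising the error in the parser means no caller can forget to set it, which is the robustness argument the commit message makes.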
Emil Velikov
9365ff4b88 egl: move eglCreateDRMImageMESA's malloc later
Don't bother allocating any memory until we're finished parsing and
sanitising all the attributes.

As a nice side effect we now consistently set eglError when any of
the attrib/values are not correct.

Strangely enough the spec does not mention _anything_ about what error
should be set where, even if the implementation already sets the odd
one.

Cc: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-07-12 15:42:03 +01:00
Brian Paul
f7e78abdf4 svga: fix texture swizzle writemasking
Commit bfe1e7737a changed how texture swizzles are set up.
This exposed a latent bug in the VMware driver: we were ignoring
the texture instruction's writemask when applying the 0 and 1
swizzle terms.

This wasn't caught by the Piglit texture swizzle test because it
only exercises fixed function (no write masking).

Fixes issues seen with ETQW apitrace.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-11 15:43:36 -06:00
Chris Wilson
cead51a0c6 i965: Use VALGRIND_MAKE_MEM_x in place of MALLOCLIKE/FREELIKE
Valgrind doesn't actually implement VALGRIND_FREELIKE_BLOCK as the
exact inverse of VALGRIND_MALLOCLIKE_BLOCK. It makes the block
inaccessible, but still leaves it defined in its allocation tracker i.e.
it will report the mmap as lost despite the call to FREELIKE!

Instead of treating the mmap as an allocation, treat it as changing the
access bits upon the memory, i.e. that it becomes defined (because the
buffer objects always contain valid content from the user's
perspective) upon mmap and inaccessible upon munmap. This makes memcheck
happy without leaving it thinking there is a very large leak.

Finally, for consistency, we treat all the mmap/munmap paths the same
even though valgrind can intercept the regular mmap used for GTT. We
could move this into the drm_mmap/drm_munmap macros, but that quickly
looks ugly given the desire for those to support different OSes, but I
didn't try that hard!

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-11 14:07:47 -07:00
Kenneth Graunke
314879f7fe i965: Fix asynchronous mappings on !LLC platforms.
When using a read-only CPU mapping, we may encounter stale buffer
contents.  For example, the Piglit primitive-restart test offers the
following scenario:

   1. Read data via a CPU map.
   2. Destroy that buffer.
   3. Create a new buffer - obtaining the same one via the BO cache.
   4. Call BufferSubData, which does a GTT map with MAP_WRITE | MAP_ASYNC.
      (We avoid set_domain for async mappings, so no flushing occurs.)
   5. Read data via a CPU map.
      (Without explicit clflushing, this will contain data from step 1!)

Otherwise, everything ought to work, keeping in mind that we never use
CPU maps for writing - just read-only CPU maps.

This restores the performance gains after Matt's revert in commit
71651b3139.

v2: Do the invalidate later, and even when asking for a brand new map.
v3: Add more comments from Chris.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-07-11 13:26:53 -07:00
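The five-step scenario in 314879f7fe above can be modelled with a deliberately simplified toy. The `ToyBuffer` class, its fields and the `invalidate` flag are hypothetical; real hardware caches, clflush and the BO cache are far more involved. Steps 2 and 3 (destroy, then get the same buffer back from the BO cache) are modelled simply by reusing one object.

```python
class ToyBuffer:
    """Toy model of a buffer on a platform without a shared last-level cache."""

    def __init__(self, data):
        self.memory = list(data)  # contents as seen through the GTT path
        self.cpu_cache = None     # cached lines as seen through a CPU map

    def cpu_read(self, invalidate=False):
        # Without invalidation, a previously populated cache is served as-is.
        if invalidate or self.cpu_cache is None:
            self.cpu_cache = list(self.memory)  # models flushing and refetching
        return list(self.cpu_cache)

    def gtt_write_async(self, offset, values):
        # Async write through the GTT: memory changes, the CPU cache is
        # not touched because no domain change (and thus no flush) occurs.
        self.memory[offset:offset + len(values)] = values


buf = ToyBuffer([1, 2, 3, 4])
first = buf.cpu_read()                  # step 1
buf.gtt_write_async(0, [9, 9])          # step 4
stale = buf.cpu_read()                  # step 5 without flushing
fresh = buf.cpu_read(invalidate=True)   # step 5 with explicit invalidation
print(first, stale, fresh)
```

In this model the second read returns the data from step 1 unless the cache is explicitly invalidated, which is the stale-contents problem the commit describes.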
Kenneth Graunke
20104f1926 i965: Don't use PREAD for glGetBufferSubData().
Just map the buffer and memcpy.  This will do a CPU mmap, which should
be reasonably efficient, and doing this gives us full control over the
domains and caching instead of leaving it to the kernel.

This prevents regressions on Braswell in the next commit.  Specifically
GL45-CTS.shader_atomic_counters.basic-buffer-operations.  Because async
maps start skipping set-domain, the pread thought everything was nicely
still in the CPU domain, and returned stale data.

v2: Use _mesa_error_no_memory() if the map fails instead of crashing.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-07-11 13:26:46 -07:00
Tim Rowley
f50aa21456 swr: build driver proper separate from rasterizer
swr used to build and link the rasterizer to the driver, and to support
multiple architectures we needed to have multiple versions of the
driver/rasterizer combination, which needed to link in much of mesa.

Changing to having one instance of the driver and just building
architecture specific versions of the rasterizer gives a large reduction
in disk space.

libGL.so        6464 Kb ->  7000 Kb
libswrAVX.so   10068 Kb ->  5432 Kb
libswrAVX2.so   9828 Kb ->  5200 Kb

Total          26360 Kb -> 17632 Kb

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-11 13:38:20 -05:00
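The size figures quoted in f50aa21456 above can be checked with a few lines of arithmetic. The numbers are copied from the commit message; the variable names are only for illustration.

```python
# Sizes in Kb, before and after the change, as listed in the commit message.
before = {"libGL.so": 6464, "libswrAVX.so": 10068, "libswrAVX2.so": 9828}
after = {"libGL.so": 7000, "libswrAVX.so": 5432, "libswrAVX2.so": 5200}

total_before = sum(before.values())
total_after = sum(after.values())
saved = total_before - total_after
percent = 100 * saved / total_before

print(total_before, total_after, saved, round(percent, 1))
```

libGL.so grows slightly because the driver proper now lives there once, while each architecture-specific rasterizer library shrinks by roughly half.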
Tim Rowley
50cd222116 swr: switch to using SwrGetInterface api table
Use the SWR rasterizer API through the table returned from
SwrGetInterface rather than referencing the functions directly.
This will allow us to move to a model of having the driver dynamically
load the appropriate swr architecture library.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-11 13:38:20 -05:00
George Kyriazis
27c5568de3 swr/rast: make SWR_VISIBLE attribute work for windows
Needed to expose SwrGetInterface

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-11 13:37:57 -05:00
Lionel Landwerlin
9d681a7a18 i965: perf: use new subslices numbers from device info
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2017-07-11 16:14:57 +01:00
Lionel Landwerlin
384aaa4d3f intel: add number of subslices to device info
We could have used a single integer to store that value, but
Cannonlake has a different number of subslices per slice depending on
the GT.

v2: Add CFL subslice numbers (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2017-07-11 16:14:57 +01:00
Ben Widawsky
25c1a7cc7a i965: Use already existing eu_total
Reduces IOCTL calls by 1, and provides a centralized place to override
such configurations if we have a need to do so.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-11 16:14:57 +01:00
Chris Wilson
618be8cc1a i965: Resolve framebuffers before signaling the fence
From KHR_fence_sync:

  When the condition of the sync object is satisfied by the fence
  command, the sync is signaled by the associated client API context,
  causing any eglClientWaitSyncKHR commands (see below) blocking on
  <sync> to unblock. The only condition currently supported is
  EGL_SYNC_PRIOR_COMMANDS_COMPLETE_KHR, which is satisfied by
  completion of the fence command corresponding to the sync object,
  and all preceding commands in the associated client API context's
  command stream. The sync object will not be signaled until all
  effects from these commands on the client API's internal and
  framebuffer state are fully realized. No other state is affected by
  execution of the fence command.

If clients are passing the fence fd (from EGL_ANDROID_native_fence_sync)
to a compositor, that fence must only be signaled once the framebuffer
is resolved and not before as is currently the case.

v2: fixup assert to use GL_SYNC_GPU_COMMANDS_COMPLETE (Chad)

Reported-by: Sergi Granell <xerpi.g.12@gmail.com>
Fixes: c636284ee8 ("i965/sync: Implement DRI2_Fence extension")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Sergi Granell <xerpi.g.12@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Chad Versace <chadversary@chromium.org>
Cc: Daniel Stone <daniels@collabora.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-07-11 15:46:58 +01:00
Brian Paul
bf7a4f4441 svga: s/unsigned/enum tgsi_texture_type/
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-07-11 08:09:14 -06:00
Brian Paul
1d82674969 svga: s/unsigned/enum tgsi_swizzle
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-07-11 08:09:14 -06:00
Brian Paul
3effacf172 svga: s/unsigned/enum tgsi_interpolate_mode/
And s/unsigned/enum tgsi_interpolate_loc/

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-07-11 08:09:14 -06:00
Brian Paul
9330112b35 svga: s/unsigned/enum tgsi_file_type/
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-07-11 08:09:14 -06:00
Brian Paul
1b5e88becd svga: s/unsigned/enum tgsi_semantic/
Makes gdb debugging a little nicer.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-07-11 08:09:14 -06:00
Kenneth Graunke
7250cbafb9 i965: Assert that we don't use CPU write maps to non-coherent buffers.
Using CPU maps of non-coherent buffers can get us in a lot of trouble,
and WC maps are a reasonable alternative anyway.  Guard against shooting
ourselves in the foot by adding an assert, and comment.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-10 15:55:34 -07:00
Chris Wilson
de4c2eaa62 i965: Disable access to CPU mmap for async access on non-LLC machines
If the user triggers an implicit batch flush while holding access to a
CPU mapped buffer, that mmapping will be invalidated by the kernel for
non-LLC devices. (The kernel when executing a batch will change the
cache domain of the buffers in that batch, which for non-LLC CPU access
will cause that buffer to be clflushed and any further CPU access to be
discarded.) To prevent this, simply disallow any CPU async mmap access.
The cases that need async CPU access to a non-LLC buffer should continue
to be allowed via their preferred snooping path.

v2 (Ken): Reword the comment slightly.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-10 15:55:31 -07:00
Chris Wilson
b532e3b4a2 i965: Track when a bo is shared with an external client
If the buffer is being shared with an external client, our own state
tracking may be stale and in some cases we may wish to double check with
the kernel/hw state. At the moment, this is synonymous with not being
reusable, but the semantics between reusable and external are quite
different and we will have more examples of non-reusable buffers in the
near future.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-10 15:55:30 -07:00
Kenneth Graunke
c2c37f5185 intel: Fix clflushing on modern (Baytrail+) Atom CPUs.
Thanks to Chris Wilson for pointing this out.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-07-10 15:55:26 -07:00
Kenneth Graunke
3e50607a40 intel: Move clflush helpers from anv to common/gen_clflush.h.
I want to use these in the OpenGL driver as well.

v2: Add to COMMON_FILES in Makefile.sources (caught by Emil)

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-10 15:55:19 -07:00
James Legg
b117f59710 spirv: Fix reaching unreachable for compare exchange on images
We were hitting the
	unreachable("Invalid image opcode")
near the end of vtn_handle_image when parsing the
SpvOpAtomicCompareExchange opcode.

v2: Add stable CC.
v3: Ignore SpvOpAtomicCompareExchangeWeak. It requires the Kernel
capability which is not exposed in Vulkan, and spirv_to_nir is not used
for OpenCL which does support it.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
CC: <mesa-stable@lists.freedesktop.org>
2017-07-10 14:13:37 -07:00
Marek Olšák
aaee0d1bbf gallium: use "ull" number suffix to keep the QtCreator parser happy
It can't parse "llu".

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-07-10 22:44:48 +02:00
Chris Wilson
833108ac14 i965: Use brw_bo_wait() for brw_bo_wait_rendering()
Currently, we use set_domain() to cause a stall on rendering. But the
set-domain ioctl has the side-effect of changing the kernel's cache
domain underneath the struct_mutex, which may perturb state if there was
no rendering to wait upon and in general is much heavier than the
lockless wait-ioctl. Historically libdrm used set-domain as we did not
have an explicit wait-ioctl (and the patches to teach it to use wait if
available were lost in the mists). Since mesa already depends upon a
kernel that supports the wait-ioctl, we do not need to supply a fallback.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-10 11:18:08 -07:00
Brian Paul
3b28eaabf6 svga: fix PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE value
This query is supposed to return the max texture buffer size/width in
texels, not size in bytes.  Divide by 16 (the largest format size) to
return texels.

Fixes Piglit arb_texture_buffer_object-max-size test.

Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-10 11:11:26 -06:00
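The bytes-to-texels conversion described in 3b28eaabf6 above amounts to one integer division. In this sketch the divisor of 16 comes from the commit message; the function name and the example byte limit are hypothetical.

```python
LARGEST_FORMAT_BYTES = 16  # largest texel format size, per the commit message


def max_texture_buffer_texels(max_bytes: int) -> int:
    """Convert a buffer size limit in bytes into a limit in texels."""
    return max_bytes // LARGEST_FORMAT_BYTES


# Hypothetical device limit of 128 MiB.
print(max_texture_buffer_texels(128 * 1024 * 1024))
```

Dividing by the largest format size is the conservative choice: a buffer of that many texels fits within the byte limit for every format.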
Brian Paul
f8f71cb6f3 svga: fix breakage in create_backed_surface_view()
This fixes a regression in some piglit tests since commit 5e5d5f1a2e.
I think I mis-resolved the merge conflict when cherry-picking that
commit to master.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-10 11:11:26 -06:00
Jason Ekstrand
781263486f anv: Stop setting domains to RENDER on EXEC_OBJECT_WRITE
The reason we were doing this was to ensure that the kernel did the
appropriate cross-ring synchronization and flushing.  However, the
kernel only looks at EXEC_OBJECT_WRITE to determine whether or not to
insert a fence.  It only cares about the domain for determining whether
or not it needs to clflush the BO before using it for scanout but the
domain automatically gets set to RENDER internally by the kernel if
EXEC_OBJECT_WRITE is set.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-07-10 08:55:47 -07:00
Ilia Mirkin
6c7b7aa3d8 a5xx: fix condition for updating *_FS_OUTPUT_CNTL
The register values depend on the currently set program, so make sure to
revalidate when the program changes.

Fixes glsl-1.10-fragdepth as well as
dEQP-GLES3.functional.shaders.fragdepth.compare.*

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2017-07-09 18:36:13 -04:00
Dave Airlie
7b5f2e0070 radv/ac: drop setting xnack
Since radv uses compute rings and we can't know when we are setting
up the shaders what ring they are to be used on, we should just use
the default xnack setting. This may be suboptimal in some places,
but if we hit a problem, we likely should try and address this
between llvm and mesa.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-09 22:21:43 +01:00
Dave Airlie
edf2acbeb1 radv: add support for using addrlib max alignment.
Rather than using 64k, use what addrlib returns as the base
alignment for vulkan allocations.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-09 22:17:59 +01:00
Ilia Mirkin
f3958f1644 nir: copy front interpolation when creating fake back color input
Fixes a bunch of gl_BackColor interpolation tests that had explicit
interpolation specified on the fragment shader gl_Color.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2017-07-08 21:27:44 -04:00
Ilia Mirkin
ce3e2ec3b7 a5xx: remove no-longer-accurate border color layout comment
Better to just point at the bcolor_entry struct which has our current
understanding encoded into it. Also add an assert to ensure that the
struct remains the expected size.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-07-08 21:14:58 -04:00
Ilia Mirkin
4ad4009473 a5xx: fix border color for depth formats
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-07-08 21:14:58 -04:00
Ilia Mirkin
cf173b5dcd a5xx: add border color clamping, add packed border color formats
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-07-08 21:14:58 -04:00
Ilia Mirkin
a9b58a00bb a5xx: fix border colors for swizzled texture formats
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-07-08 21:14:58 -04:00
Ilia Mirkin
a4eeb0c403 a5xx: fix integer texture border colors
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-07-08 21:14:58 -04:00
Ilia Mirkin
1acc101b3f a5xx: fix primitive restart
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-07-08 21:14:58 -04:00
Andres Gomez
e6b189351f nir/spirv: Remove unnecessary comment.
It should have been removed after 00c47e111c.

Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Connor Abbott <cwabbott0@gmail.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-08 21:14:19 +03:00
Bas Nieuwenhuizen
1aba0e7f58 radv: Add compute htile clear for combined depth+stencil surfaces.
Figured out the clear value when we have a combined depth stencil
surface.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-07-08 16:11:29 +02:00
Roland Scheidegger
4db72852a1 draw: handle more TGSI_SEMANTIC_COLOR indices
It could only handle indices 0/1, otherwise what happened was bad (accessing
array out of bounds, no crash but kind of random). This is enough for the gl
state tracker (primary/secondary color) but not enough for some other state
trackers (d3d9 has no limits on the number of color interpolants).
The complexity with color semantics is all due to the front/back mapping (2
outputs in the vs map to one input in the fs) so this isn't extended to
indices > 1 - d3d9 has no use for back colors, therefore this isn't needed and
still only 2 back colors can be handled correctly.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-07-08 06:02:18 +02:00
Matias N. Goldberg
f728435e1f st/mesa: Fix grabbing the wrong variant if glDrawPixels is called
By design pixel shaders can have up to 3 variants:
* The standard one.
* glDrawPixels variant.
* glBitmap variant.
However "shader_has_one_variant" ignores this fact, and therefore
st_update_fp would select the wrong variant if glDrawPixels or glBitmap
was ever called.

This patch fixes the problem. If the standard variant has been created,
calling glDrawPixels or glBitmap will append the variant to the second
entry of the linked list, so that st_update_fp still selects the right
one if shader_has_one_variant is set.

If the standard variant hasn't been created yet and glDrawPixels/glBitmap
has been called, st_update_fp will see this and take the slow path
instead. The standard variant will then be added at the front of the
linked list, so that the fast path is taken the next time.

Blender in particular is hit by this bug.

v2: Marek - cosmetic changes

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=101596

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-07-08 01:44:51 +02:00
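The variant ordering described in f728435e1f above can be sketched with a Python list standing in for the linked list. `add_variant` and the string labels are hypothetical, and this is a model of the ordering rule only, not the st/mesa code.

```python
def add_variant(variants, new_variant, is_standard):
    """Insert a variant while keeping the standard one at the head.

    The standard variant always goes to the front. A glDrawPixels or
    glBitmap variant goes to the second position when something is
    already in the list, and otherwise becomes the only entry.
    """
    if is_standard:
        variants.insert(0, new_variant)
    elif variants:
        variants.insert(1, new_variant)
    else:
        variants.append(new_variant)


variants = []
add_variant(variants, "drawpixels", is_standard=False)
add_variant(variants, "standard", is_standard=True)
add_variant(variants, "bitmap", is_standard=False)
print(variants)
```

With this rule, a fast path that blindly takes the first entry picks up the standard variant as soon as it exists.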
Nanley Chery
753a7bbc84 Revert "intel/isl: Only create a CCS buffer if the image supports rendering"
This reverts commit 8aaa13467d, which was
based on an incorrect assumption. Unlike the restriction placed on image
views in the Vulkan API, OpenGL allows you to render to texture views
whose formats differ from the originals.

Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=101677
2017-07-07 14:24:58 -07:00
Brian Paul
9ac55e8219 mesa: finish implementing glPrimitiveRestartNV() for display lists
If we try to build a display list with just a glPrimitiveRestartNV()
call, we'd crash because of a null GLvertexformat::PrimitiveRestartNV
pointer.  This change fixes that case.

The previous patch fixed the case of calling glPrimitiveRestartNV()
inside a glBegin/End pair.

v2: minor clean-up in save_PrimitiveRestartNV(), per Charmaine.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-07 12:22:46 -06:00
Olivier Lauffenburger
f5c8bb1e00 vbo: fix glPrimitiveRestartNV crash inside a display list
glPrimitiveRestartNV crashes when it is called during the compilation
of a display list.

There are two reasons:
- ctx->Driver.CurrentSavePrimitive is not set to the current primitive
- save_PrimitiveRestartNV() calls _save_Begin() which only sets an
  OpenGL error, instead of calling vbo_save_NotifyBegin().

This patch correctly calls vbo_save_NotifyBegin() but it detects
the current primitive mode by looking at the latest saved primitive.

Additional work by Brian Paul

Signed-off-by: Olivier Lauffenburger <o.lauffenburger@topsolid.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101464
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-07 12:22:34 -06:00
Brian Paul
1d0bdfb56d st/mesa: remove unused st_framebuffer::Private field
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-07 12:04:58 -06:00
Brian Paul
1b3cbcc7be mesa: add some braces in _mesa_make_current()
Slightly better readability.
2017-07-07 12:04:34 -06:00
Brian Paul
960aa95df6 vbo: rename target->index in loopback code
Because it's a vertex attribute index.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-07 12:04:34 -06:00
Brian Paul
650b3c8756 vbo: whitespace/formatting fixes in vbo_save_loopback.c
Trivial.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-07 12:04:34 -06:00
Brian Paul
7022e298b0 vbo: simplify vbo_save_NotifyBegin()
This function always returned GL_TRUE.  Just make it a void function.
Remove unreachable code following the call to vbo_save_NotifyBegin()
in save_Begin() in dlist.c

There were some stale comments that no longer applied since an earlier
code refactoring.

No Piglit regressions.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-07 12:04:34 -06:00
Brian Paul
5b8d33acef svga: adjust line subpixel position for HWv8
This fixes two regressions on HWv8:
  Piglit gl-1.0-ortho-pos
  Piglit/glean fbo
This was caused by commit c2b92dada0 "svga: clamp device line width
to at least 1 to fix HWv8 line stippling"

This also fixes two conform tests: Vertex Order and Polygon Face

No Piglit/conform changes with HWv9 or later.

VMware bug 1905053

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-07 12:04:17 -06:00
Aleksander Morgado
5d8514de14 etnaviv: fix refcnt initialization in etna_screen
Despite being a member of the etna_screen struct, 'refcnt' is used by
the winsys-specific logic to track the reference count of the object
managed in a hash table. When the count reaches zero, the pipe screen
is removed from the table and destroyed.

Fix the logic by initializing the refcnt to 1 when the screen is created.
This initialization is done in etna_screen_create(), to follow the
same logic as in freedreno and virgl.

Fixes: c9e8b49b88 ("etnaviv: gallium driver for Vivante GPUs")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-07-07 15:39:29 +02:00
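The reference-counting scheme that 5d8514de14 above relies on can be sketched as a small table of shared screens. `ScreenTable`, `acquire`, `release` and the device key are hypothetical names; only the rule "start at 1, destroy at 0" comes from the commit message.

```python
class ScreenTable:
    """Toy table of shared screens keyed by device, with reference counts."""

    def __init__(self):
        self._screens = {}

    def acquire(self, device_key):
        screen = self._screens.get(device_key)
        if screen is None:
            # A newly created screen already has one user, so the
            # count starts at 1 rather than 0.
            screen = {"refcnt": 1}
            self._screens[device_key] = screen
        else:
            screen["refcnt"] += 1
        return screen

    def release(self, device_key):
        """Drop one reference; return True if the screen was destroyed."""
        screen = self._screens[device_key]
        screen["refcnt"] -= 1
        if screen["refcnt"] == 0:
            del self._screens[device_key]
            return True
        return False


table = ScreenTable()
table.acquire("fd0")
table.acquire("fd0")
print(table.release("fd0"), table.release("fd0"))
```

If the count started at 0, the first release would drive it negative and the screen would never be removed from the table at the right time.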
Ilia Mirkin
c036122646 a5xx: add support for rendering to RGB10A2_UNORM formats
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-07-07 09:09:48 -04:00
Ilia Mirkin
a00727ab25 a5xx: set uint/sint bits for mrt output register
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-07-07 09:09:48 -04:00
Ilia Mirkin
e803023614 a5xx: add backface stencil emission
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-07-07 09:09:48 -04:00
Samuel Pitoiset
a584a12308 radeonsi: fix invalidating bindless buffer descriptors
The VA is stored at [4:5], not [0:1]. This invalidated all
texture buffer descriptors when they were made resident in
the current context.

This removes a few partial flushes and cache invalidations which
are needed when updating a bindless descriptor on the fly with
a WRITE_DATA packet.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-07 09:09:39 +02:00
Olivier Lauffenburger
43dea65ad2 st/wgl: Implement wglUseFontBitmaps.
wglUseFontBitmaps is currently a noop.
This patch implements this function for Windows.
Misc code clean-ups by Brian.

Signed-off-by: Olivier Lauffenburger <o.lauffenburger@topsolid.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-07-06 17:26:05 -06:00
Olivier Lauffenburger
80c6598cdb st/wgl: improve selection of pixel format
Current selection of pixel format does not enforce the request of
stencil or depth buffer if the color depth is not the same as
requested.

For instance, GLUT requests a 32-bit color buffer with an 8-bit
stencil buffer, but because color buffers are only 24-bit, no
priority is given to creating a stencil buffer.

This patch gives more priority to the creation of requested buffers
and less priority to the difference in bit depth.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101703
Signed-off-by: Olivier Lauffenburger <o.lauffenburger@topsolid.com>
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-07-06 17:25:58 -06:00
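The prioritisation described in 80c6598cdb above can be illustrated with a simple penalty score. The weight `MISSING_BUFFER_PENALTY`, the dictionary keys and both function names are assumptions made for this sketch; the commit message does not give the actual scoring used in st/wgl.

```python
MISSING_BUFFER_PENALTY = 1000  # hypothetical weight, much larger than any bit-depth gap


def format_penalty(requested, candidate):
    """Lower is better: a missing requested buffer costs far more than a depth mismatch."""
    penalty = 0
    for key in ("color_bits", "depth_bits", "stencil_bits"):
        want, have = requested[key], candidate[key]
        if want > 0 and have == 0:
            penalty += MISSING_BUFFER_PENALTY
        else:
            penalty += abs(want - have)
    return penalty


def choose_format(requested, candidates):
    return min(candidates, key=lambda c: format_penalty(requested, c))


# The GLUT-like request from the commit message: 32-bit color, 8-bit stencil.
requested = {"color_bits": 32, "depth_bits": 0, "stencil_bits": 8}
no_stencil = {"color_bits": 24, "depth_bits": 0, "stencil_bits": 0}
with_stencil = {"color_bits": 24, "depth_bits": 24, "stencil_bits": 8}
print(choose_format(requested, [no_stencil, with_stencil]))
```

A score based purely on bit-depth differences would prefer the format without a stencil buffer here (16 versus 32), while the weighted score picks the one that actually provides the requested stencil.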
Alex Smith
c2a5cb6427 ac/nir: Fix ordering of parameters for image atomic cmpswap intrinsics
The NIR parameters are ordered "compare, data", matching GLSL, but both
the image and buffer LLVM intrinsics take them the other way around.
This is already handled correctly for SSBO atomics.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
2017-07-07 00:57:25 +02:00
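The operand-order mismatch fixed in c2a5cb6427 above can be shown with a toy compare-and-swap. Both function names are hypothetical, and the real LLVM intrinsics take additional operands; only the "compare, data" versus "data, compare" ordering is taken from the commit message.

```python
def llvm_style_cmpswap(memory, index, data, compare):
    """Toy compare-and-swap taking (data, compare), returning the old value."""
    old = memory[index]
    if old == compare:
        memory[index] = data
    return old


def nir_to_llvm_cmpswap_args(nir_args):
    """Reorder NIR's (compare, data) pair into (data, compare)."""
    compare, data = nir_args
    return (data, compare)


# Intent: if the cell holds 5, replace it with 7.
nir_args = (5, 7)  # (compare, data)

wrong = [5]
llvm_style_cmpswap(wrong, 0, *nir_args)  # passed through unswapped

right = [5]
llvm_style_cmpswap(right, 0, *nir_to_llvm_cmpswap_args(nir_args))

print(wrong, right)
```

Passing the operands through unswapped silently compares against the new value, so the exchange does not happen when it should.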
Brian Paul
7cc6ee56c6 mesa: simplify get_tex_images_for_clear()
Get rid of redundant code.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-06 16:33:57 -06:00
Brian Paul
f3a608d9f9 mesa: new comments, assertion related to glClearTexSubImage
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-06 16:33:57 -06:00
Brian Paul
ccdcce3638 st/mesa: find proper mipmap level in st_ClearTexSubImage()
The Piglit arb_clear_texture-error test creates a texture with only
a 1x1 image at level=1, then tries to clear level 0 (nonexistent)
and level 1 (exists).  The test only checks that the former generates
an error but the latter doesn't.  The test passes, but when we try
to clear the level=1 image we're passing an invalid level to
pipe_context::clear_texture().  level=1, but since there's only one
mipmap level in the texture, it should be zero.

This fixes the code to search the gallium texture resource for the
correct mipmap level.  Also, add an assertion to make sure we're not
passing an invalid level to pipe_context::clear_texture().

Fixes device errors with VMware driver.  No Piglit regressions.

v2: don't do the level search when using immutable textures.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-06 16:33:57 -06:00
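One way to picture the level search from ccdcce3638 above is to match the image dimensions against each level of the resource. Matching by width and height is an assumption made for this sketch, since the commit message does not say how the search is done; `minify` and `find_mipmap_level` are hypothetical names, and depth and array layers are ignored.

```python
def minify(size, level):
    """Size of a mipmap dimension at the given level, never below 1."""
    return max(1, size >> level)


def find_mipmap_level(res_width, res_height, res_last_level, img_width, img_height):
    """Return the resource level whose size matches the image, or None."""
    for level in range(res_last_level + 1):
        if (minify(res_width, level) == img_width
                and minify(res_height, level) == img_height):
            return level
    return None


# The case from the commit message: the GL texture only has a 1x1 image
# at level 1, so the backing resource is 1x1 with a single level (0).
print(find_mipmap_level(1, 1, 0, 1, 1))
```

Here GL level 1 maps to resource level 0, which is the value that should be handed to the clear call instead of the GL level.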
Brian Paul
7e2669d02f st/mesa: whitespace fixes in st_cb_fbo.c
Trivial.
2017-07-06 16:33:57 -06:00
Brian Paul
dd50663e7b st/mesa: whitespace fixes in st_texture.c
Trivial.
2017-07-06 16:33:57 -06:00
Dave Airlie
8950fac6ab radv: don't overallocate depth/stencil formats
For depth/stencil formats the surface layer allocates the
stencil separately, so we don't need to include it in the
bpe.

This reduces the size of d32s8 allocations to something closer to pro.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-06 23:23:22 +01:00
Dave Airlie
09d7c7be4f radv: enable sisched toggle in perftest flags.
RADV_PERFTEST=sisched

to enable it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-06 23:07:49 +01:00
Dave Airlie
d97275e42c ac/llvm: set xnack like radeonsi does.
Use family, but only set xnack+ for gfx9.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-06 23:07:45 +01:00
Dave Airlie
01e958d631 ac/llvm: create features list using snprintf.
Just more moving code around before adding things to it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-06 23:06:04 +01:00
Dave Airlie
9d9f051390 ac/radv: change api to create target machine
This just modifies the API to make it easier to add other flags
to target machine creation.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-06 23:05:59 +01:00
Eric Engestrom
076faf8764 build systems: move git_sha1_gen.sh to bin/
There was no reason for this script to live outside the scripts
directory.

Suggested-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-07-06 22:59:39 +01:00
Tim Rowley
bab03c06fc swr/rast: Correctly allocate SWR_STATS memory as cacheline aligned
Cacheline alignment of SWR_STATS to prevent sharing of cachelines
between threads (performance).

Gets rid of gcc-7.1 warning about using c++17's over-aligned new
feature.

Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-06 15:01:00 -05:00
Tim Rowley
1f0680b51e swr/rast: remove unused variables
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-06 15:00:54 -05:00
Tim Rowley
d50ef7332c swr/rast: don't use _mm256_fmsub_ps in AVX code
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-06 15:00:48 -05:00
Tim Rowley
f0a22956be swr/rast: _mm*_undefined_* implementations for gcc<4.9
Define these in terms of setzero for ancient gcc versions which don't
have the undefined intrinsics.

Cc: mesa-stable@lists.freedesktop.org

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-07-06 15:00:28 -05:00
Aleksander Morgado
a6893a50c8 etnaviv: don't dereference etna_resource pointer if allocation fails
The check for the pointer being non-NULL was being done too late.

Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-07-06 21:06:25 +02:00
Vinson Lee
c5d0dc7fa5 scons: Check for xlocale.h before defining HAVE_XLOCALE_H.
Don't assume the header is present on some platforms - use the more
robust CheckHeader() instead.

glibc 2.26 removed xlocale.h.
https://sourceware.org/glibc/wiki/Release/2.26#Removal_of_.27xlocale.h.27

Fix this build error with glibc 2.26.

  Compiling src/util/strtod.c ...
src/util/strtod.c:32:10: fatal error: xlocale.h: No such file or directory
 #include <xlocale.h>
          ^~~~~~~~~~~

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101657
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-07-06 09:51:28 -07:00
Dave Airlie
a6c2001ace radv: add support for cmd predication.
This doesn't get used yet, it just adds support to various PKT3
emissions to enable it later.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-06 02:06:49 +01:00
Ilia Mirkin
880f21f55d glsl: check if any of the named builtins are available first
_mesa_glsl_has_builtin_function is used to determine whether any variant
of a builtin is available, for the purpose of enforcing the GLSL ES
3.00+ rule that overloads or overrides of builtins are disallowed.

However the builtin_builder contains information on all builtins,
irrespective of parse state, or versions, or extension enablement. As a
result we would say that a builtin existed even if it was not actually
available.

To resolve this, first check if at least one signature is available for
a builtin before returning true.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101666
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-05 20:05:53 -04:00
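The check introduced by 880f21f55d amounts to "a builtin exists only if at least one of its signatures is available in the current state". A minimal Python sketch follows; the `Signature` class, the version-only availability rule, and the builtin names are hypothetical simplifications of the real builtin_builder.

```python
class Signature:
    """Hypothetical builtin signature gated on a minimum language version."""

    def __init__(self, min_version):
        self.min_version = min_version

    def is_available(self, state):
        return state["version"] >= self.min_version


def has_builtin_function(builtins, state, name):
    """True only if some signature of `name` is usable in this parse state."""
    signatures = builtins.get(name, [])
    return any(sig.is_available(state) for sig in signatures)


# Every builtin is registered regardless of version, as in the commit message.
builtins = {
    "old_op": [Signature(100)],
    "new_op": [Signature(310), Signature(320)],
}
state = {"version": 300}
```

With this state, `new_op` is registered but not available, so a shader may still define its own function with that name without tripping the overload restriction.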
Jason Ekstrand
ab1939aea8 nir/spirv: Rework function argument setup
Now that we have proper pointer types, we can be more sensible about the
way we set up function arguments and deal with the two cases of pointer
vs. SSA parameters distinctly.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:56 -07:00
Jason Ekstrand
0bdc622d43 nir/spirv: Stop trying to convert pointers to SSA in glsl450
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:55 -07:00
Jason Ekstrand
849bfc85c9 nir/spirv: Use real pointer types
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:55 -07:00
Jason Ekstrand
ca62e849d3 nir/spirv: Stop using glsl_type for function types
We're going to want the full vtn_type available to us anyway at which
point glsl_type isn't really buying us anything.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:55 -07:00
Jason Ekstrand
96f2439858 nir/spirv: Beef up the type system a bit
This adds a vtn concept of base_type as well as a couple of other
fields.  This lets us be a tiny bit more efficient in some cases but,
more importantly, it will eventually let us express things the GLSL type
system can't.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:55 -07:00
Jason Ekstrand
ad4519696d nir/spirv: Compact vtn_type
Use an anonymous union of structs to help keep the structure small and
better organized.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:54 -07:00
Jason Ekstrand
55da2cfba2 nir/spirv: Simplify type copying
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:54 -07:00
Jason Ekstrand
62ebca1fe6 nir/spirv: Compute offsets for UBOs and SSBOs up-front
Now that we have a pointer wrapper class, we can create offsets for UBOs
and SSBOs up-front instead of waiting until we have the full access
chain.  For push constants, we still use the old mechanism because it
provides us with some nice range information.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:54 -07:00
Jason Ekstrand
604eda3712 nir/spirv: Rework the way pointers get dereferenced
This has the advantage of moving all of the "extend an access chain"
code into one place.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:54 -07:00
Jason Ekstrand
4c21e6b7f8 nir/spirv: Break variable creation out into a helper
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:53 -07:00
Jason Ekstrand
2e92d6a392 nir/spirv: Remove unneeded parameters from pointer_to_offset
Everyone now calls it with stop_at_matrix = false.  Since we're now
always walking all the way to the end of the access chain, the type
returned is just the same as ptr->type.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:53 -07:00
Jason Ekstrand
6d30f33307 nir/spirv: Simplify matrix loads/stores
Instead of handling all of the complexity at the end, we choose to
decorate types a bit more cleverly.  When we have a row-major matrix
type, we give it the stride of a single vector and give its array
element type (which represents a column) the actual matrix stride.

Previously, we were using stop_at_matrix and handling everything from
matrix on down as special cases but now we walk the access chain all the
way to the end and then load.  Even though this looks like it may lead
to a significant functional change, it doesn't.  The reason why we
needed to do stop_at_matrix before was to handle row-major properly
since the offsets and strides would be all out-of-order.  Now that row
major matrix types have the small stride on the matrix and the large
stride on the vector, offsetting to a single column of a row-major
matrix works fine.  The load/store code simply picks up on the fact that
the stride isn't the type size and does multiple loads.  The generated
code from these methods should be the same.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:53 -07:00
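The stride swap in 6d30f33307 can be checked with a little arithmetic. The Python sketch below assumes a row-major float matrix whose "small" stride equals the scalar size (4 bytes) and whose "large" stride is the declared matrix stride; the function names and the 16-byte stride in the example are illustrative choices, not values taken from the commit.

```python
import struct


def column_offsets(col, num_rows, matrix_stride, elem_size):
    """Byte offsets of one column's components in a row-major matrix.

    Stepping to a column uses the small stride (elem_size); stepping
    between that column's components uses the large matrix stride.
    """
    base = col * elem_size
    return [base + row * matrix_stride for row in range(num_rows)]


def load_column(buf, col, num_rows, matrix_stride, elem_size=4):
    """The stride differs from the scalar size, so do one load per component."""
    offsets = column_offsets(col, num_rows, matrix_stride, elem_size)
    return [struct.unpack_from("<f", buf, off)[0] for off in offsets]


# A 2x2 row-major matrix [[1, 2], [3, 4]] with a 16-byte matrix stride:
# each row occupies 16 bytes, of which the last 8 are padding.
buf = struct.pack("<8f", 1.0, 2.0, 0.0, 0.0, 3.0, 4.0, 0.0, 0.0)
column1 = load_column(buf, col=1, num_rows=2, matrix_stride=16)
```

Column 1 is read from offsets 4 and 20 and yields `[2.0, 4.0]`, which is why walking the access chain to the end and then issuing strided loads gives the same result as the old special-cased path.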
Jason Ekstrand
00c47e111c nir/spirv: Use the correct stride for non-32-bit vectors
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:53 -07:00
Jason Ekstrand
415e198d48 nir/spirv: Wrap access chains in a new vtn_pointer data structure
The vtn_pointer structure provides a bit better abstraction than passing
access chains around directly.  For one thing, if the pointer just
points to a variable, we don't need the access chain at all.  Also,
pointers know what their dereferenced type is so we can avoid passing
the type in a bunch of places.  Finally, pointers can, in theory, be
extended to the case where you don't actually know what variable is
being referenced.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:52 -07:00
Jason Ekstrand
06b5eeda17 nir/spirv: Rename some things from access_chain to pointer
We're about to add a vtn_pointer data structure and this will prevent
some rename churn in the next commit.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:52 -07:00
Jason Ekstrand
4e0280d37d nir/spirv: Split up Uniform and UniformConstant storage classes
We were originally handling them together because I was rather unclear
on the distinction.  However, keeping them combined keeps the confusion.
Split them up so that it's more clear from the code how we expect the
two storage classes to be used.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:52 -07:00
Jason Ekstrand
32a60dbef3 nir/spirv: Add a storage_class_to_mode helper
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:52 -07:00
Jason Ekstrand
a10d887ad1 nir/spirv: Use the type from the deref for atomics
Previously, we were using the type of the variable which is incorrect.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:51 -07:00
Jason Ekstrand
cc577ca377 nir/spirv: Move a "}"
It's closing a "{" at the beginning of a switch case.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-07-05 15:26:51 -07:00
Jason Ekstrand
0673bbfd9b i965: Move surface resolves back to draw/dispatch time
This is effectively a revert of 388f02729b
though much code has been added since.  Kristian initially moved it to
try and avoid locking problems with meta-based resolves.  Now that meta
is gone from the resolve path (for good this time, we hope), we can move
it back.  The problem with having it in intel_update_state was that the
UpdateState hook gets called by core mesa directly and all sorts of
things will cause an UpdateState to get called which may trigger resolves
at inopportune times.  In particular, it gets called by _mesa_Clear and,
if we have a HiZ buffer in the INVALID_AUX state, causes a HiZ resolve
right before the clear which is pointless.  By moving it back to
try_draw_prims time, we know it will only get called right before a draw
which is where we want it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-05 14:22:40 -07:00
Vinson Lee
95731b7ccc mesa: Avoid set comprehension.
Fix build error on CentOS 6.9 with Python 2.6.

  GEN    main/format_fallback.c
  File "./main/format_fallback.py", line 42
    names = {fmt.name for fmt in formats}
                        ^
SyntaxError: invalid syntax

Fixes: a1983223d8 ("mesa: Add _mesa_format_fallback_rgbx_to_rgba() [v2]")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-07-05 12:48:26 -07:00
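Set comprehensions were only added in Python 2.7, which is why the line quoted in 95731b7ccc fails on Python 2.6. One portable spelling, shown here as a sketch, passes a generator expression to `set()`. The `Format` tuple and the format names are placeholders, and the log does not show which spelling the actual fix used.

```python
from collections import namedtuple

# Placeholder for the format objects the generator script iterates over.
Format = namedtuple("Format", ["name"])

formats = [Format("FORMAT_RGBX"), Format("FORMAT_RGBA"), Format("FORMAT_RGBX")]

# Works on Python 2.6 as well as later versions; equivalent to
# {fmt.name for fmt in formats} on interpreters that support it.
names = set(fmt.name for fmt in formats)
```

Both forms build the same set, so the change is purely about interpreter compatibility.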
Bas Nieuwenhuizen
860a8e6b99 ac/nir: Move VS position exports before param exports.
According to Nicolai the SX can already start work when all
the position exports are done, so do those first.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-07-05 20:23:00 +02:00
Bas Nieuwenhuizen
3d527ba19b radv: Always set depthbuffer using image format instead of iview format.
We have some cases where changing between depth and stencil only aspect
was causing hangs.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-07-05 20:23:00 +02:00
Bas Nieuwenhuizen
7c7196e35c radv: Disable depth & stencil tests when the depthbuffer doesn't support it.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-07-05 20:23:00 +02:00
Tomasz Figa
0ede0f9dff egl: android: Fix potential use of uninitialized variable
If dri2_setup_extensions() fails, the "err" variable would not be assigned
causing the error path to access an uninitialized variable. Fix it by
assigning an error message.

Fixes: 2c341f2bda ("egl: refactor dri2_create_screen() into three separate functions")
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-05 18:49:22 +01:00
Tomasz Figa
50a8a7377a intel: common: Fix link failure with standalone Android build
Some reshuffle in the Makefiles under src/intel resulted in Android
libraries being no longer linked with code using
src/intel/common/gen_debug.h that contains references to functions
exported by those libraries (namely ALOGW macro, which is currently
resolved into a call to __android_log_print() from cutils).

Fix the build by taking into account ANDROID_CFLAGS and ANDROID_LIBS for
affected module on Android NDK builds.

Fixes: d5b355ce5f ("i965: Move intel_debug.h to intel/common/gen_debug.h")
Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-05 18:49:21 +01:00
Mauro Rossi
b7ee56b599 Android: generate symlinks for all enabled gallium drivers
The current post-install command relies on the GALLIUM_TARGET_DRIVERS variable;
however, the variable needs to be initialized in src/gallium/Android.mk
so that symlinks for all enabled gallium drivers are generated correctly.

At the moment due to sorting of INC_DIRS and variable set with svga (vmwgfx)
only vmwgfx_dri.so and virtio_gpu_dri.so symlinks are generated.

Fixes: a3d98ca62f ("Android: use symlinks for driver loading")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-05 15:10:42 +01:00
Tomeu Vizoso
79827f50e2 android: build imx-drm winsys
Add Android.mk for winsys/imx/drm.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-05 15:10:31 +01:00
Rob Herring
77c446711b android: add etnaviv driver build support
Add etnaviv to Android makefiles.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-07-05 15:10:31 +01:00
Nicolai Hähnle
c2065ed687 st/glsl_to_nir: fix edgeflag passthrough
We have to mark the additional shader input as used, otherwise it will
be eliminated, and we have to set up its index correctly.

This is a bit of a hack, but so is everything surrounding edgeflag
passthrough.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-05 12:27:12 +02:00
Nicolai Hähnle
8a4cd79d00 st/mesa: use pipe_shader_type_from_mesa
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-07-05 12:27:12 +02:00
Nicolai Hähnle
c7ecbd1153 tgsi_from_mesa: add tgsi_get_gl_frag_result_semantic
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-05 12:27:11 +02:00
Nicolai Hähnle
fb1c4e3d47 tgsi_from_mesa: add pipe_shader_type_from_mesa
So... the pipe_ prefix doesn't really fit into a TGSI header; on the
other hand, the return type has the pipe_ prefix.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-05 12:27:11 +02:00
Nicolai Hähnle
497b95fdf6 tgsi,st/mesa: move varying slot to semantic mapping into a helper for VS
We will use this helper in radeonsi's NIR path.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-05 12:27:11 +02:00
Nicolai Hähnle
d91f97f91d ddebug: handle some cases of non-TGSI shaders
NIR shaders are not captured properly in pipelined mode currently. This
would require shader cloning, which requires linking all the Gallium
drivers against NIR. We can always do that later.

v2: avoid immediate crashes in pipelined mode

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2017-07-05 12:27:11 +02:00
Nicolai Hähnle
10e1d2d9aa glsl_to_nir: zero-initialize var->data.descriptor_set
This is convenient for backends that support both Vulkan and OpenGL while
lowering samplers to derefs with nir_lower_samplers_as_deref.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-05 12:27:10 +02:00
Nicolai Hähnle
9a81d032c1 glsl: add glsl_base_type_is_integer
We will use this from radeonsi/nir, which we want to keep as pure C code.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-07-05 12:27:10 +02:00
Nicolai Hähnle
34df9525f6 nir: add NIR_PRINT environment variable
Reviewed-by: Rob Clark <robdclark@gmail.com>
2017-07-05 12:27:07 +02:00
Nicolai Hähnle
3628efedf2 glsl/blob: add valgrind checks that written data is defined
Undefined data will eventually trigger a valgrind error while computing
its CRC32 while writing it into the disk cache, but at that point, it is
basically impossible to track down where the undefined data came from.

With this change, finding the origin of undefined data becomes easy.

v2: remove duplicate VALGRIND_CFLAGS (Emil)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-05 12:26:10 +02:00
Nicolai Hähnle
210ebd4b9c glsl: explicitly zero out padding to gl_shader_variable bitfield
Otherwise, the padding bits remain undefined, which leads to valgrind
errors when storing the gl_shader_variable in the disk cache.

v2: use rzalloc instead of an explicit padding member variable

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-05 12:25:49 +02:00
Nicolai Hähnle
f4f7096c1d glsl: simplify add_uniform_to_shader::visit_field
Each field gets a distinct name, so we should never hit the case where
the name already exists in the parameter list.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-05 12:25:49 +02:00
Nicolai Hähnle
727e8ba133 glsl: look for multiple variables simultaneously with find_assignment_visitor
Save some passes over the IR.

v2: redesign to make the users of find_assignments more readable
v3:
- fix missing !
- add some comments and make the num_found check more explicit (Timothy)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-05 12:25:21 +02:00
Marek Olšák
a2b02c4948 gallium/radeon: fix VDPAU breakage, need VRAM with WC
2017-07-05 01:14:48 +02:00
Ilia Mirkin
1e73fc6b1a a5xx: enable polygon offset clamps
This is already set and emitted by the code.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Rob Clark <robdclark@gmail.com>
2017-07-04 18:27:57 -04:00
Ilia Mirkin
def1b94c33 a5xx: implement logicop support
The former 0x60 hardcoded in is equivalent to ROP_COPY with the shift.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Rob Clark <robdclark@gmail.com>
2017-07-04 18:27:57 -04:00
Ilia Mirkin
abe8740e33 a5xx: enable polygon mode selection
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Rob Clark <robdclark@gmail.com>
2017-07-04 18:27:57 -04:00
Ilia Mirkin
8108b56023 a5xx: disable ARB_depth_clamp for now
We need to figure out how to implement it properly. Right now it doesn't
work at all.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Rob Clark <robdclark@gmail.com>
2017-07-04 18:27:57 -04:00
Ilia Mirkin
5d9d1df183 a5xx: fix clip_halfz support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Rob Clark <robdclark@gmail.com>
2017-07-04 18:27:57 -04:00
Ilia Mirkin
02379b68f6 a5xx: improve 3d texture sampling
At least the first level works now. Eventually the later levels stop
working; there appears to be some alignment issue. But this improves the
situation immensely.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Rob Clark <robdclark@gmail.com>
2017-07-04 18:27:57 -04:00
Ilia Mirkin
c0f1efe04d a5xx: remove one of the MIPFILTER_LINEAR bits
It doesn't appear to do what we want. Removing this bit makes
lodclamp-between as well as a number of dEQP tests pass, with no visible
ill effect.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Rob Clark <robdclark@gmail.com>
2017-07-04 18:27:57 -04:00
Ilia Mirkin
f1fc619bd8 a5xx: enable formats newly added to the headers
This enables S3TC, BPTC, ETC2, and ASTC texture decoding. Additionally
this enables RGB32 texture buffer objects, as well as 11_11_10_FLOAT and
10_10_10_2 vertex formats (and related extensions).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Rob Clark <robdclark@gmail.com>
2017-07-04 18:27:57 -04:00
Ilia Mirkin
b68e22d5e2 a5xx: include color swap when decoding vertices
This fixes support for BGRA vertex formats

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Rob Clark <robdclark@gmail.com>
2017-07-04 18:27:57 -04:00
Ilia Mirkin
5fdcddbeb4 a5xx: update headers
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Rob Clark <robdclark@gmail.com>
2017-07-04 18:27:57 -04:00
Marek Olšák
156832ee2b gallium/radeon: attempt to fix a compiler failure in radeon_winsys.h
trivial.
2017-07-04 22:40:35 +02:00
Marek Olšák
0591df025b winsys/amdgpu: use 128KB BOs for suballocations of up to 64KB BOs
This decreases the number of BOs, but might also increase memory usage.
It's better for small textures.

The gameplay is on the far right:
https://people.freedesktop.org/~mareko/suballoc.svg

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
c784015643 gallium/radeon: allow suballocating textures
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
23446eedd1 gallium/radeon: generalize the function for in-place texture reallocation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
91f72975ac gallium/radeon: add radeon_winsys::buffer_is_suballocated
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
0f13451da3 gallium/radeon: clean up pb_cache bucket/usage determination
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
d4fac1e1d7 gallium/radeon: enable suballocations for VRAM with no CPU access
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
64e5577cac gallium/radeon: clean up (domain, flags) <-> (slab heap) translations
This is cleaner, and we are down to 4 slabs.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
b09a22ad21 gallium/radeon: remove RADEON_FLAG_CPU_ACCESS
https://lists.freedesktop.org/archives/amd-gfx/2017-June/010591.html

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
03c5ef195d gallium/radeon: disallow exports of sparse and suballocated BOs
I think it's unsafe, because the slabs can reuse exported storage.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
047c34f0ac gallium/radeon: clean up r600_texture_get_handle
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
7525c3e123 gallium/radeon: rename RADEON_FLAG_HANDLE -> RADEON_FLAG_NO_SUBALLOC
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
e6dbe975ef gallium/radeon: fix a possible crash for buffer exports
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
fee2883bd7 gallium/radeon: ignore PIPE_BIND_SHARED for buffers
BO exports can't be predicted this way.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Marek Olšák
5b373629fc radeonsi: add a HUD query for getting an average GFX BO list size
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-04 15:40:37 +02:00
Philipp Zabel
7d7bcd65d6 st/mesa: release EGLImage on EGLImageTarget* error
The smapi->get_egl_image() call in st_egl_image_get_surface() stores a
reference to the EGLImage's texture in stimg.texture. That reference is
released via pipe_resource_reference(&stimg.texture, NULL) before stimg
goes out of scope at the end of the function, but not in the error path
if !is_format_supported().

Fixes: 83e9de25f3 ("st/mesa: EGLImageTarget* error handling")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-04 11:05:40 +02:00
Juan A. Suarez Romero
2c240a7205 vc4: automake: include vc4_cl_dump.h in
Ensure vc4_cl_dump.h and $(BROADCOM_FILES) are distributed in the
dist-file.

This fixes `make distcheck`

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-04 09:37:19 +02:00
Marek Olšák
11a924c174 st/mesa: fix tessellation shaders with no support for shareable shaders
Broken by: b43c887a9b

Reported by Gert Wollny.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-07-03 23:08:28 +02:00
Dave Airlie
1bc40ae952 radv: enable Int64 capability (v2)
I'm not 100% sure this is all wired up but it looks like it is.

v2: actually enable extension.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-07-03 11:58:59 -07:00
Connor Abbott
2ec77f7a3c ac/nir: fix 64-bit shifts
NIR always makes the shift amount 32 bits, but LLVM asserts if the two
sources aren't the same type. Zero-extend the shift amount to make LLVM
happy.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-07-03 11:58:59 -07:00
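The fix in 2ec77f7a3c is about operand types rather than arithmetic, so the Python sketch below models values as (bit width, payload) pairs. All names are hypothetical; `build_shl` merely imitates a builder that rejects mismatched operand widths, and the example assumes the shift amount is smaller than the bit width.

```python
from collections import namedtuple

# A tiny stand-in for a typed IR value: a bit width plus an unsigned payload.
Value = namedtuple("Value", ["bits", "value"])


def build_zext(v, bits):
    """Zero-extend: the payload is unchanged, only the width grows."""
    if bits < v.bits:
        raise ValueError("zext cannot narrow a value")
    return Value(bits, v.value)


def build_shl(lhs, rhs):
    """Strict shift-left that rejects operands of different widths."""
    if lhs.bits != rhs.bits:
        raise ValueError("shift operands must have the same type")
    mask = (1 << lhs.bits) - 1
    return Value(lhs.bits, (lhs.value << rhs.value) & mask)


def emit_ishl(src0, src1):
    """Widen a narrower shift amount before calling the strict builder."""
    if src1.bits < src0.bits:
        src1 = build_zext(src1, src0.bits)
    return build_shl(src0, src1)


# A 64-bit value shifted by a 32-bit amount, as NIR would supply it.
result = emit_ishl(Value(64, 1), Value(32, 40))
```

Zero-extension is the natural choice here because a shift amount is an unsigned count; sign-extending it would turn large 32-bit amounts into nonsense.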
Connor Abbott
7168425dd7 ac/nir: implement 64-bit packing and unpacking
We implement the split opcodes, and tell NIR to lower the original ones.
The lowering to LLVM is a little more complicated, but NIR can optimize
the split ones a little better, and some NIR lowering passes that we
might want to use (particularly for doubles) emit the split ones.

This should fix pack/unpackDouble2x32, which seems to have been broken since
we enabled the Float64 capability. It will also fix pack/unpackInt2x32
when we enable the Int64 capability.

Fixes: 798ae37c ("radv: Enable Float64 support.")
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-07-03 11:58:58 -07:00
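For readers unfamiliar with the split opcodes mentioned in 7168425dd7, here is a small Python model of their arithmetic. The function names are modeled on the NIR opcode names, but the code is only a sketch of the semantics (first component in the low 32 bits), not the LLVM lowering the commit adds.

```python
MASK32 = 0xFFFFFFFF


def pack_64_2x32_split(lo, hi):
    """Combine two 32-bit halves into one 64-bit value (lo in the low bits)."""
    return (lo & MASK32) | ((hi & MASK32) << 32)


def unpack_64_2x32_split_x(value):
    """Low 32 bits of a 64-bit value."""
    return value & MASK32


def unpack_64_2x32_split_y(value):
    """High 32 bits of a 64-bit value."""
    return (value >> 32) & MASK32


def pack_64_2x32(vec2):
    """The vector form, expressed through the split form."""
    return pack_64_2x32_split(vec2[0], vec2[1])


def unpack_64_2x32(value):
    """The vector form, expressed through the two split unpacks."""
    return (unpack_64_2x32_split_x(value), unpack_64_2x32_split_y(value))


packed = pack_64_2x32((0x89ABCDEF, 0x01234567))
```

Expressing the vector opcodes through the split ones mirrors the approach in the message: only the split forms need a backend implementation once the originals are lowered.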
Connor Abbott
196e6b60b1 spirv: fix OpBitcast when the src and dst bitsize are different (v3)
Before, we were just implementing it with a move, which is incorrect
when the source and destination have different bitsizes. To implement
it properly, we need to use the 64-bit pack/unpack opcodes. Since
glslang uses OpBitcast to implement packInt2x32 and unpackInt2x32, this
should fix them on anv (and radv once we enable the int64 capability).

v2: make supporting non-32/64 bit easier (Jason)
v3: add another assert (Jason)

Fixes: b3135c3c ("anv: Advertise shaderInt64 on Broadwell and above")
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-03 11:58:50 -07:00
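A sketch of the dispatch implied by 196e6b60b1: a bitcast between equal bit sizes can stay a plain move, while 32-bit to 64-bit and back must regroup the bits. The Python below is an illustration under that reading; the function name, the list-of-integers representation, and the restriction to 32/64-bit sizes are assumptions.

```python
MASK32 = 0xFFFFFFFF


def bitcast(components, src_bits, dst_bits):
    """Reinterpret a vector of unsigned integers at a different bit size."""
    if src_bits == dst_bits:
        return list(components)  # same size: a move is sufficient
    if (src_bits, dst_bits) == (32, 64):
        if len(components) % 2 != 0:
            raise ValueError("need an even number of 32-bit components")
        # Lower-numbered components land in the lower-order bits.
        return [(components[i] & MASK32) | ((components[i + 1] & MASK32) << 32)
                for i in range(0, len(components), 2)]
    if (src_bits, dst_bits) == (64, 32):
        result = []
        for value in components:
            result.append(value & MASK32)
            result.append((value >> 32) & MASK32)
        return result
    raise NotImplementedError("only 32 <-> 64 bit casts are sketched here")


wide = bitcast([0x89ABCDEF, 0x01234567], 32, 64)
narrow = bitcast(wide, 64, 32)
```

A move in the 32 to 64 case would keep two separate components instead of one combined value, which is the incorrect behavior the commit message describes.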
Brian Paul
6158c0b5d8 svga: don't call svga_texture_device_format_has_alpha() for PIPE_BUFFER
svga_texture_device_format_has_alpha() is only intended to work for
texture resources, not buffer resources.  This fixes a failed assertion
in the svga_texture() cast function when running texture buffer tests.

Also, add an assertion in svga_texture_device_format_has_alpha() to
catch the issue sooner.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-03 10:10:14 -06:00
Brian Paul
e6d1cc31fa svga: fix texture buffer object regression
With change 8aba778fa2 we stopped binding
sampler objects for texture buffers.  That broke our texture sample /
sampler view setup code.

Now, we loop over the max(num samplers, num sampler views) and handle
the sampler and view information separately.  For texture buffers,
the sampler will be NULL but the sampler view non-null.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-03 10:10:13 -06:00
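The loop structure described in e6d1cc31fa can be sketched as follows. This is plain Python with hypothetical names and string placeholders for the driver objects; it only illustrates iterating up to the larger of the two counts and treating a missing sampler as NULL.

```python
def collect_bindings(samplers, views):
    """Pair samplers and sampler views slot by slot, up to the larger count."""
    count = max(len(samplers), len(views))
    bindings = []
    for i in range(count):
        # Either side may be absent for a slot; handle them independently.
        sampler = samplers[i] if i < len(samplers) else None
        view = views[i] if i < len(views) else None
        bindings.append((sampler, view))
    return bindings


# Slot 1 is a texture buffer: it has a view but no sampler object.
bindings = collect_bindings(["linear_sampler"], ["tex2d_view", "texbuf_view"])
```

Looping only over the sampler count would stop after slot 0 here and never set up the texture buffer's view.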
Brian Paul
6b4bf7e8be svga: move assertion in draw_vgpu10()
The buffer binding flags aren't ensured until after the
svga_buffer_handle() call, so move the assertion after it.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-03 10:10:13 -06:00
Brian Paul
9bd047aa26 svga: fix buffer binding flags initialization
If a buffer is created/initialized with glNamedBufferData we will
have no target (GL_ARRAY_BUFFER, GL_UNIFORM_BUFFER, etc) so the
svga_buffer::bind_flags will be zero until we try to get the buffer
handle.

This patch initializes the svga_buffer::bind_flags field when it's
zero.

This fixes the Piglit arb_uniform_buffer_object-rendering-dsa test.

Note that there are still issues in this area that'll have to be
addressed in the future.  For example, creating a buffer object
as GL_UNIFORM_BUFFER and later using it as a vertex buffer will
fail.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-07-03 10:10:11 -06:00
Brian Paul
8e4559b3fc docs: update bug reporting guidelines
Suggest attaching output of glxinfo/wglinfo.  Suggest providing
an apitrace.
2017-07-03 08:14:08 -06:00
Nicolai Hähnle
2f89c39861 st/mesa: remove an obsolete comment
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-03 13:57:01 +02:00
Nicolai Hähnle
7c5b204e38 mesa: remove unused parameter/member of add_uniform_to_shader
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-07-03 13:57:01 +02:00
Nicolai Hähnle
8988571824 util/disk_cache: fix a comment
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-03 13:57:01 +02:00
Nicolai Hähnle
da506cce8a glsl: simplify disable_varying_optimizations_for_sso
We always have stage == first and stage == last when first == last, so
drop the special case. Also rephrase the comment to make the logic
clearer.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-07-03 13:54:20 +02:00
Nicolai Hähnle
141d0831ff glsl: always print non-zero var->data.location_frac
This is helpful in debugging varying assignments.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-07-03 13:54:06 +02:00
Nicolai Hähnle
b0b4b5e8f7 winsys/radeon: only call pb_slabs_reclaim when slabs are actually used
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100242
Fixes: fb827c055c ("winsys/radeon: enable buffer allocation from slabs")
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-03 12:39:41 +02:00
Samuel Iglesias Gonsálvez
5dd96b1156 anv: check support for enabled features in vkCreateDevice()
From Vulkan spec, 4.2.1. "Device Creation":

  "vkCreateDevice verifies that extensions and features requested in
   the ppEnabledExtensionNames and pEnabledFeatures members of
   pCreateInfo, respectively, are supported by the implementation."

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@gmail.com>
2017-07-03 08:01:31 +02:00
Samuel Iglesias Gonsálvez
ba05f6f72b anv: merge tessellation's primitive mode in merge_tess_info()
SPIR-V tessellation shaders that were created from HLSL will have
the primitive generation domain set in tessellation control shader
(hull shader in HLSL) instead of the tessellation evaluation shader.

v2:
- Add assert (Kenneth)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-03 08:00:43 +02:00
Bruce Cherniak
32c1a54bd0 swr: Limit memory held by defer deleted resources.
This patch limits the number of items on the fence work queue (the
deferred deletion list) by submitting a sync fence when the queue size
exceeds a threshold.  This initiates deferred deletion of all resources
on the list and decreases the total amount of memory held waiting for
"deferred deletion".

This resolves bug 101467 filed against swr for the piglit
streaming-texture-leak test.  For those running on smaller memory
(16GB?) systems, this will prevent oom-killer.

Thus far, we have not seen any real world applications that exhibit
behavior like the streaming-texture-leak test; as any form of pipeline
flush will trigger the defer queue and properly free any retained
allocations.  But, this addresses those as well.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-07-02 17:38:57 -05:00
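The threshold mechanism described in the swr commit above can be sketched as a small Python model. This is not the driver code: `FenceWorkQueue`, `defer_delete`, `submit_sync_fence`, and the threshold of 3 are hypothetical names and values chosen only to illustrate how capping the queue length bounds the memory held for deferred deletion.

```python
class FenceWorkQueue:
    """Toy model of a deferred-deletion list with a size threshold."""

    def __init__(self, threshold):
        self.threshold = threshold  # hypothetical limit on queued items
        self.pending = []           # resources waiting for a fence
        self.freed = []             # resources released so far

    def defer_delete(self, resource):
        self.pending.append(resource)
        # Once the queue grows past the threshold, force a sync fence so
        # everything queued so far gets released.
        if len(self.pending) > self.threshold:
            self.submit_sync_fence()

    def submit_sync_fence(self):
        # In this model the fence completes immediately and frees the list.
        self.freed.extend(self.pending)
        self.pending.clear()


queue = FenceWorkQueue(threshold=3)
for resource_id in range(10):
    queue.defer_delete(resource_id)
```

With this model, the number of resources held at any time never exceeds the threshold plus one, regardless of how many deletions are requested.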
Lionel Landwerlin
038c45a40e anv: fix reported timestampPeriod value
We lost some precision on a previous change due to switching to
integers. Since we report a float in timestampPeriod, we want the
division to happen in floats.

CID: 1413021
Fixes: c77d98ef32 ("intel: common: express timestamps units in frequency")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-02 12:11:55 +01:00
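The precision loss described in the anv commit above can be illustrated with a short Python sketch. It assumes the period is computed as one billion nanoseconds divided by a timestamp frequency in Hz; the 12 MHz frequency and both function names are hypothetical. Python's `//` operator stands in for C integer division.

```python
NS_PER_SECOND = 1_000_000_000


def period_ns_integer_division(timestamp_frequency_hz):
    # The division truncates first; converting to float afterwards cannot
    # bring the lost fraction back.
    return float(NS_PER_SECOND // timestamp_frequency_hz)


def period_ns_float_division(timestamp_frequency_hz):
    # Dividing in floating point keeps the fractional nanoseconds.
    return NS_PER_SECOND / timestamp_frequency_hz


frequency_hz = 12_000_000  # hypothetical 12 MHz timestamp clock
truncated = period_ns_integer_division(frequency_hz)
precise = period_ns_float_division(frequency_hz)
```

For this hypothetical clock the integer version reports 83.0 ns per tick, while the float version reports roughly 83.33 ns, an error that accumulates over long timestamp intervals.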
Lionel Landwerlin
34560ba9e5 intel: genxml: make a couple of enums show up in aubinator
In particular Shader Channel Select & Texture Address Control Mode.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-07-02 00:45:38 +01:00
Kenneth Graunke
df1279e9df i965: Print access flags in INTEL_DEBUG=buf output.
Being able to see the access mode of various mappings is incredibly
useful for debugging.  With this patch, INTEL_DEBUG=buf now shows
data such as:

   bo_create: buf 7 (bufferobj) 640b
   bo_map_gtt: 7 (bufferobj) -> 0x7fca1fae5000, WRITE ASYNC
   brw_bo_map_cpu: 7 (bufferobj) -> 0x7fca1fae4000, READ
   bo_map_gtt: 5 (bufferobj) -> 0x7fca1fad4000, WRITE ASYNC
   brw_bo_map_cpu: 7 (bufferobj) -> 0x7fca1fae4000, READ

which makes it easy to see that there are async GTT writes with
intervening CPU reads.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-07-01 11:48:08 -07:00
Chris Wilson
d8382d6889 i965: Remove clearing of bo->map_gtt after failure
With the conversion to storing the result of drm_mmap to a local and not
directly to bo->map_gtt itself, we no longer should clear bo->map_gtt.
In the best case the operation is redundant as we know bo->map_gtt to already
be NULL, but in the worst case we overwrite a concurrent thread that
successfully mmaped the GTT.

Fixes: 314647c4c2 ("i965: Drop global bufmgr lock from brw_bo_map_* functions.")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-07-01 11:46:34 -07:00
Kenneth Graunke
f78aa2c986 i965: Add inline to brw_bo_unmap
I meant to do this in "i965: Make brw_bo_unmap a static inline."
but botched the commit fixup.
2017-06-30 20:35:14 -07:00
Chris Wilson
314647c4c2 i965: Drop global bufmgr lock from brw_bo_map_* functions.
After removing the unusable debugging code in the previous commit, we
can also entirely remove the global mutex around mapping the buffer for
the first time and replace it with a single atomic operation to update
the cache once we retrieve the mmap.

v2 (Ken): Split out from Chris's original commit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-06-30 15:54:57 -07:00
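The locking change in the two i965 commits above (mapping into a local, then publishing it with one atomic operation, and leaving the cached pointer alone on failure) can be modelled roughly as below. This is a sketch under assumptions: Python has no hardware compare-and-swap, so a small lock stands in for the atomic, and `Buffer`, `compare_and_swap`, `map_buffer`, `do_mmap`, and `do_munmap` are hypothetical names.

```python
import threading


class Buffer:
    def __init__(self):
        self.map_gtt = None                 # cached mapping, shared by threads
        self._cas_lock = threading.Lock()   # stands in for a hardware atomic

    def compare_and_swap(self, expected, new):
        """Store new only if map_gtt is still expected; return the old value."""
        with self._cas_lock:
            old = self.map_gtt
            if old is expected:
                self.map_gtt = new
            return old


def map_buffer(bo, do_mmap, do_munmap):
    if bo.map_gtt is None:
        local = do_mmap()  # result goes to a local, not to bo.map_gtt
        if local is None:
            # Failure: do NOT clear bo.map_gtt here. Another thread may have
            # published a valid mapping in the meantime.
            return None
        if bo.compare_and_swap(None, local) is not None:
            # Lost the race: drop our mapping and use the winner's.
            do_munmap(local)
    return bo.map_gtt


released = []
first = Buffer()
result_a = map_buffer(first, lambda: "A", released.append)
result_b = map_buffer(first, lambda: "B", released.append)  # cache hit

raced = Buffer()


def mmap_while_other_thread_wins():
    raced.map_gtt = "winner"  # simulate a concurrent thread publishing first
    return "loser"


result_race = map_buffer(raced, mmap_while_other_thread_wins, released.append)
```

The point of the design is that only the thread whose compare-and-swap succeeds ever writes the cached pointer, so no path needs to hold a global mutex or reset the field.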
Kenneth Graunke
bca92849b9 i965: Make brw_bo_unmap a static inline.
With the broken debugging code gone, it doesn't do anything anymore.
We could technically eliminate it, but I'd like to keep it around in
case we want to add something there again someday.  Otherwise we'd
have to go all over the codebase adding unmap calls back again.

Based on a patch by Chris Wilson.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-06-30 15:54:54 -07:00
Chris Wilson
c913241458 i965: Discard bo->map_count
Supposedly we were keeping a reference count for the number of users of
a mapping so that we could use valgrind to detect access to the map
outside of the valid section. However, we were incrementing the counter
only when first creating the cached mapping but decrementing on every
unmap. The bo->map_count tracking was wrong and so the debugging code
was completely useless.

v2 (Ken): Separate out atomic compare and swap optimization.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-06-30 15:54:52 -07:00
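The counting bug described in the bo->map_count commit above can be reproduced with a toy model. The class and method names here are hypothetical; only the increment-on-first-map and decrement-on-every-unmap behaviour is taken from the commit message.

```python
class MappedBuffer:
    """Toy model of a buffer with a cached mapping and a broken map counter."""

    def __init__(self):
        self.cached_map = None
        self.map_count = 0

    def map(self):
        if self.cached_map is None:
            self.cached_map = object()
            self.map_count += 1  # incremented only when the cache is created
        return self.cached_map

    def unmap(self):
        self.map_count -= 1      # decremented on every unmap


bo = MappedBuffer()
bo.map()
bo.unmap()
bo.map()    # cache hit: no increment
bo.unmap()
final_count = bo.map_count
```

After two balanced map/unmap pairs the counter ends at -1 rather than 0, so it cannot tell a debugging tool whether the mapping is currently in use.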
Kenneth Graunke
58d81d9dc2 i965: Add a comment about not needing VALGRIND_MALLOCLIKE_BLOCK.
At first glance this seems missing, since we handle it manually for CPU
and WC maps.  Although a bit inconsistent, it's actually not necessary.

Thanks to Chris Wilson for explaining this to me.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-06-30 15:54:44 -07:00
Bas Nieuwenhuizen
87d3349393 radv: Use v4i32 variant of llvm.SI.load.const.
We apparently still used v16i8 ....

As radeonsi doesn't use it with LLVM version checks I don't think
we need them either.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-30 23:30:55 +02:00
Brian Paul
f215f42f1b svga: add texture size/levels sanity check code in svga_texture_create()
The state tracker should never ask us to create a texture with invalid
dimensions / mipmap levels.  Do some assertions to check that.

No Piglit regressions.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-06-30 13:37:10 -06:00
Brian Paul
95d5c48f68 st/mesa: fix texture image resource selection in st_render_texture()
If we're rendering to an incomplete/inconsistent (cube) texture, the
different faces/levels of the texture may be stored in different
resources.  Before, we always used the texture object resource.  Now,
we use the texture image resource.  In normal circumstances, that's
the same resource.  But in some cases, such as the Piglit
fbo-incomplete-texture-03 test, the cube faces are in different
resources and we need to render to the texture image resource.

Fixes fbo-incomplete-texture-03 with VMware driver.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-06-30 13:37:10 -06:00
Brian Paul
f4d5e55dd1 st/mesa: check for incomplete texture in st_finalize_texture()
Return early from st_finalize_texture() if we have an incomplete
texture.  This avoids trying to create a texture resource with invalid
parameters (too many mipmap levels given the base dimension).

Specifically, the Piglit fbo-incomplete-texture-03 test winds up
calling pipe_screen::resource_create() with width0=32, height0=32 and
last_level=6 because the first five cube faces are 32x32 but the sixth
face is 64x64.  Some drivers handle this, but others (like VMware svga)
do not (generates device errors).

Note that this code is on the path that's usually not taken (we normally
build consistent textures).

No Piglit regressions.

v2: only need to check for base-level completeness since that's what has to
be consistent in order to specify the dimensions for a new gallium texture.
Per Roland.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-06-30 13:37:10 -06:00
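The invalid parameters mentioned in the st_finalize_texture() commit above follow from simple arithmetic: a mipmap chain halves the largest dimension at each level until it reaches 1. A minimal sketch, with hypothetical helper names:

```python
def max_last_level(width0, height0, depth0=1):
    # floor(log2(largest dimension)): a 32-texel base has levels 0..5.
    return max(width0, height0, depth0).bit_length() - 1


def levels_fit_base_size(width0, height0, last_level):
    return last_level <= max_last_level(width0, height0)


bad_request = levels_fit_base_size(32, 32, last_level=6)
good_request = levels_fit_base_size(64, 64, last_level=6)
```

A 32x32 base supports at most last_level=5, so the width0=32, height0=32, last_level=6 request quoted in the commit is out of range, while the same last_level would be valid for a 64x64 base.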
Brian Paul
e54fe78e0e gallium/docs: document that TXF is used with PIPE_BUFFER resources
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-06-30 13:37:10 -06:00
Brian Paul
f4091e1638 gallium/docs: clarify that samplers are not used with PIPE_BUFFER resources
Commit 8aba778fa2 "st/mesa: don't set
sampler states for TBOs" changed how texture buffer objects are handled.
Document the new convention.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-06-30 13:37:10 -06:00
Eric Anholt
d623040dd5 vc4: Start using XML unpack functions in CL dump.
For now this is a no-op on the output, but it makes it clear that we've
had weird things going on with things like
V3D21_CLIPPER_Z_SCALE_AND_OFFSET.
2017-06-30 12:25:45 -07:00
Eric Anholt
56541d356d vc4: Replace a couple of magic numbers with #define usage.
2017-06-30 12:25:45 -07:00
Eric Anholt
f6c5c6b9be vc4: Move rasterizer state packing to CSO creation time.
This gets our vc4_emit.c size back down a bit:

before:
   1020       0       0    1020     3fc src/gallium/drivers/vc4/.libs/vc4_emit.o

after:
    968	      0	      0	    968	    3c8	src/gallium/drivers/vc4/.libs/vc4_emit.o
2017-06-30 12:25:45 -07:00
Eric Anholt
bd1925562a vc4: Convert the driver to emitting the shader record using pack macros.
2017-06-30 12:25:45 -07:00
Eric Anholt
8d36bd3d08 vc4: Simplify pack header usage
Take the CL pointer in, which will be useful for enabling relocs.
However, our code expands a bit more:

before:
   4449       0       0    4449    1161 src/gallium/drivers/vc4/.libs/vc4_draw.o
    988       0       0     988     3dc src/gallium/drivers/vc4/.libs/vc4_emit.o

after:
   4481	      0	      0	   4481	   1181	src/gallium/drivers/vc4/.libs/vc4_draw.o
   1020	      0	      0	   1020	    3fc	src/gallium/drivers/vc4/.libs/vc4_emit.o
2017-06-30 12:25:45 -07:00
Eric Anholt
4cef255872 vc4: Start using the pack header.
This slightly inflates the size of the generated code, in exchange for
getting us some convenient tools.

before:
   4389	      0	      0	   4389	   1125	src/gallium/drivers/vc4/.libs/vc4_draw.o
    808	      0	      0	    808	    328	src/gallium/drivers/vc4/.libs/vc4_emit.o

after:
   4449	      0	      0	   4449	   1161	src/gallium/drivers/vc4/.libs/vc4_draw.o
    988	      0	      0	    988	    3dc	src/gallium/drivers/vc4/.libs/vc4_emit.o
2017-06-30 12:25:45 -07:00
Eric Anholt
7f80a9ff13 vc4: Introduce XML-based packet header generation like Intel's.
I really liked this idea, as it should help with management of packet
parsing tools like the CL dump.  The python script is forked off of theirs
because our packets are byte-based instead of dwords, and the changes to
do so while avoiding performance regressions due to unaligned accesses
were quite invasive.

v2: Fix Android.mk paths, drop shebang for python script, fix overlap
    detection.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Rob Herring <robh@kernel.org>
2017-06-30 12:25:45 -07:00
Bruce Cherniak
6646f6ba0d swr: Minor cleanup of variable usage, no functional change.
In swr_update_derived, for consistency, index buffer validation should
be using the p_draw_info copy "info" rather than referencing
p_draw_info.

No functional change.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
b9b53e2695 swr: use swr_query_result type instead of void
Tag pStat field in swr_draw_context structure so gen_llvm_types.py
can deal with the actual structure type instead of using void.

Code cleanup, no functional change.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
80bd5cd9d0 swr/rast: increase number of possible draws in flight
Increases performance of some large workloads on KNL by ~30%.

Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
ab564c7ab4 swr/rast: move default split size from driver to rasterizer
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
64af92c977 swr/rast: Fix missing setup of psContext.pColorBuffer
Fixes render target read access from pixel shaders.

Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
fc4f6c44c4 swr/rast: Switch intrinsic usage to SIMDLib
Switch from a macro-based simd intrinsics layer to a more C++
implementation, which also adds AVX512 optimizations to 128-bit
and 256-bit SIMD.

Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
8b66d18a3b scons: allow .inl file extension
Intended for header files which are not meant to be included directly.

Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
614de92f10 swr/rast: Fix unused variable warnings
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
0cc7c46cf4 swr/rast: Split rasterizer.cpp to improve compile time
Hardcode split to four files currently.  Decreases swr build
time on KNL by over 50%.

Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
5eecaca911 swr/rast: gen_backends.py remove extraneous semicolon
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
f87ff64850 swr/rast: Support dynamically sized vertex layout
Each shader stage state (VS, TS, GS, SO, BE/CLIP) now has a
vertexAttribOffset to specify the offset to the start of the
general attribute section of the incoming verts for that stage.
It is up to the driver to set this up correctly based on the
active stages. All the shader stages use this value instead of
VERTEX_ATTRIB_START_SLOT to offset to the incoming attributes.

Only the vertex shader stage supports dynamic layout output
currently. The other stages continue to expect the output to be
the fixed layout slots as before. Will be enabling GS next.

Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
cae53b24d7 swr/rast: Split backend.cpp to improve compile time
Hardcode split to four files currently.  Decreases swr build
time on a quad-core by ~10%.

Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
b89bd3694c swr/rast: gen_backends.py removal of commented debug prints
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
248663f91d swr/rast: gen_backends.py quote cleanup
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Tim Rowley
ba64ddedc2 swr/rast: generators will create target directories
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
2017-06-30 13:26:19 -05:00
Andres Gomez
7d0f80b1d3 docs: update calendar, add news item and link release notes for 17.1.4
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-06-30 20:38:01 +03:00
Andres Gomez
ed587f7868 docs: add sha256 checksums for 17.1.4
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 5a24aa8c55)
2017-06-30 20:34:32 +03:00
Andres Gomez
158fb2ef20 docs: add release notes for 17.1.4
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit e60d010ef4)
2017-06-30 20:34:32 +03:00
Mauro Rossi
84690d06c1 Android: fix typo in symlink for driver loading and 32 bit builds
There is a typo in the mkdir command path,
the correct one is $(TARGET_OUT)/$(l)/$(MESA_DRI_MODULE_REL_PATH)

The other issue is in 32bit builds, because lib64 does not exist there,
we can use TARGET_IS_64_BIT to refine the post install command.

Fixes: a3d98ca62f ("Android: use symlinks for driver loading")

Signed-off-by: Rob Herring <robh@kernel.org>
2017-06-30 11:23:51 -05:00
Brian Paul
0782350b80 svga: update a few surface format names
To sync with in-house changes.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-06-30 08:24:27 -06:00
Brian Paul
d3cbe8c5f3 svga: whitespace fixes in svga_resource_buffer_upload.c
Trivial.
2017-06-30 08:24:27 -06:00
Charmaine Lee
5e5d5f1a2e svga: add mksstats for surface view emulation
Add mksstats for surface view emulation and also tighten the stat
CreateBackedView for the actual creation of backed view.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-06-30 08:24:27 -06:00
Brian Paul
4f3974d758 svga: change error handling convention for svga_set_stream_output()
In general, the functions which emit commands to the command buffer check
for failure and return a PIPE_ERROR_x code.  It's up to the caller to
flush the buffer and retry the command.

But svga_set_stream_output() did its own flushing and the callers never
checked the return value (though in practice it would always be PIPE_OK).

This patch changes svga_set_stream_output() so that it does not call
svga_context_flush() when the buffer is full.  And we update the callers
to check the return value as we do for other functions, like
svga_set_shader().

No Piglit regressions.  Also tested w/ Nature demo.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-06-30 08:24:27 -06:00
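The convention described in the svga_set_stream_output() commit above (the emitting function reports failure, the caller flushes and retries) can be sketched as follows. This is a toy model, not the svga code: `OK`, `ERROR_BUFFER_FULL`, `CommandBuffer`, and `emit_with_retry` are hypothetical stand-ins for the driver's PIPE_ERROR_x codes and command buffer.

```python
OK = 0
ERROR_BUFFER_FULL = -1  # hypothetical stand-in for a PIPE_ERROR_x code


class CommandBuffer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.commands = []
        self.submitted = []

    def emit(self, command):
        # The emit function never flushes on its own; it only reports failure.
        if len(self.commands) >= self.capacity:
            return ERROR_BUFFER_FULL
        self.commands.append(command)
        return OK

    def flush(self):
        self.submitted.extend(self.commands)
        self.commands.clear()


def emit_with_retry(buffer, command):
    # Caller-side convention: on failure, flush once and try again.
    ret = buffer.emit(command)
    if ret != OK:
        buffer.flush()
        ret = buffer.emit(command)
    return ret


buffer = CommandBuffer(capacity=2)
results = [emit_with_retry(buffer, name) for name in ("a", "b", "c")]
```

Keeping the flush in the caller means every emit function behaves the same way, and the caller decides what else must be re-emitted after a flush.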
Charmaine Lee
adead35320 svga: fixed surface size to include array size
This patch fixes the total surface size in surface cache
to include array size as well.

Tested with MTT glretrace.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-06-30 08:24:27 -06:00
Neha Bhende
31fe1d10b2 svga: loop over box.depth for ReadBack_image on each slice
piglit test ext_texture_array-gen-mipmap is fixed with this patch.

Tested with mtt piglit, glretrace, viewperf and conform. No regression.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-06-30 08:24:27 -06:00
Charmaine Lee
203d88460c svga: add mksstats for context creation
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-06-30 08:24:27 -06:00
Charmaine Lee
dbb5d2a790 svga: re-validate sampler view at draw time if needed
This patch re-validates sampler views that use a backing copy of a
texture whose original copy has been updated since the view was last
validated.
This is done at draw time because the texture binding might not have
been modified, so validation is not triggered at state update time,
and yet the texture might have been updated in another context. We
therefore need to re-validate the sampler view in order to update the
backing copy of the updated texture.

This fixes a rendering flicker issue with Photoshop running in a
Linux VM with HWversion 11. The problem is that Photoshop renders to texture A
in context X and then binds texture A to context Y. The first time
texture A is bound to context Y, cso calls pipe->set_sampler_views(),
the sampler views are validated, and rendering is fine.
But when texture A is rendered to again in context X and rebound in
context Y, cso skips pipe->set_sampler_views() because texture A is already
bound in context Y. The SVGA driver is not given a chance to re-validate
the texture binding, the backing copy of the texture is not updated,
and the result is a black image.

Tested with Photoshop, MTT glretrace, piglit.
Fixes VMware bug 1769103.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-06-30 08:24:27 -06:00
Juan A. Suarez Romero
5355107034 automake: include git_sha1_gen.sh into EXTRA_DIST
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-30 16:15:54 +02:00
Nicolai Hähnle
7fd08177a6 Update Khronos-supplied headers
Taken from commit 676834dd529d620ee25090e738d2607dfde003d8
of https://github.com/KhronosGroup/OpenGL-Registry.git

v2:
- keep the BUILDING_MESA bits (Matt)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-30 15:29:50 +02:00
Johnson Lin
165e704719 i965/i915: Add UYVY as the supported format
Trigger the correct sampler options for it, similar to YUYV.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-06-30 10:16:26 +01:00
Johnson Lin
8ff4be44b7 nir: Add a lowering pass for UYVY textures
Similar to the YUYV support, but with a different byte order in the sampler.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-06-30 10:16:26 +01:00
Johnson Lin
194205cd00 dri: Add UYVY as available format
UYVY differs from YUYV in byte order.
YUYV is already declared in dri_interface.h;
this CL adds the definitions for UYVY.
Drivers can add UYVY as a supported format.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-06-30 10:16:26 +01:00
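The byte-order difference that the three UYVY commits above refer to can be shown with a small sketch. It assumes the usual packed 4:2:2 layout, in which four bytes carry two pixels that share one U and one V sample; the function names are hypothetical and this is not the NIR lowering code.

```python
def unpack_yuyv(macropixel):
    # YUYV byte order: Y0, U, Y1, V
    y0, u, y1, v = macropixel
    return (y0, u, v), (y1, u, v)


def unpack_uyvy(macropixel):
    # UYVY byte order: U, Y0, V, Y1
    u, y0, v, y1 = macropixel
    return (y0, u, v), (y1, u, v)


from_yuyv = unpack_yuyv(bytes([10, 20, 30, 40]))
from_uyvy = unpack_uyvy(bytes([20, 10, 40, 30]))
```

Both calls decode the same two pixels; only the position of the luma and chroma bytes inside the four-byte group differs, which is why the sampler setup for UYVY can mirror the YUYV path with the channels swapped.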
Rob Herring
6335652899 gbm: add XBGR8888 support for dumb buffers
Add GBM_FORMAT_XBGR8888 format support which is needed for Android.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-30 08:44:19 +01:00
Rob Herring
cceb2d5c41 gallium: os_process fixes for Android
The function getprogname() is available on Android, since its C runtime
reuses various BSD solutions.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-30 08:44:19 +01:00
Tomeu Vizoso
2ecdedb8d4 etnaviv: Add unreachable statement to etna_amode to fix compilation warnings
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-30 08:44:16 +01:00
Bruce Cherniak
277621bbb7 swr: Remove need to allocate vertex buffer scratch space all in one go
Deferred deletion (via "fence_work") has obsoleted the need to allocate
all client vertex buffer scratch space in a single chunk.  Scratch
allocations are now valid until the referenced fence is complete.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-06-29 13:23:33 -05:00
Bruce Cherniak
2b27dcd075 swr: conditionally validate vertex buffer state
Vertex buffer state doesn't need to be validated on every call,
only on dirty _NEW_VERTEX or indexed draws.

Unconditional validation was introduced as part of patch 330d0607ed,
"remove pipe_index_buffer and set_index_buffer", with the expectation
we'd optimize later.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-06-29 13:23:33 -05:00
Tim Rowley
867e111769 swr: set dynamic vertex size
Reduces the memory footprint of the frontend processing by packing
vertices.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-29 13:23:33 -05:00
Eric Engestrom
9f6110ad32 scons: wait on subprocess' completion
Windows doesn't allow you to move a file that's opened, and Popen()
doesn't wait on its subprocess' completion before returning, which leads
to a broken Windows build.

Fixes: 3fd425aed7 "build systems: uniformize git_sha1.h generation"
Suggested-by: Scott D Phillips <scott.d.phillips@intel.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-29 17:38:26 +01:00
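The race described in the scons commit above comes from `subprocess.Popen()` returning as soon as the child has started. A minimal sketch of the wait-then-move pattern is shown below; the file names and the child command are placeholders, not the actual git_sha1.h generation step.

```python
import os
import subprocess
import sys
import tempfile


def generate_then_move(tmp_path, final_path):
    with open(tmp_path, "w") as out:
        # Placeholder child process that writes one line to the temp file.
        proc = subprocess.Popen(
            [sys.executable, "-c", "print('generated')"], stdout=out
        )
        # Popen() does not block; wait() does. Without it the move below
        # could run while the child still has the file open.
        proc.wait()
    os.replace(tmp_path, final_path)
    return proc.returncode


with tempfile.TemporaryDirectory() as workdir:
    tmp_file = os.path.join(workdir, "header.tmp")
    final_file = os.path.join(workdir, "header.h")
    exit_code = generate_then_move(tmp_file, final_file)
    with open(final_file) as handle:
        contents = handle.read()
```

Moving the file only after `wait()` returns (and after the parent's own handle is closed) avoids the "file is in use" failure on Windows and also makes the exit code available for error checking.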
Eric Engestrom
3fd425aed7 build systems: uniformize git_sha1.h generation
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-29 16:24:58 +01:00
Marek Olšák
ccfac28835 radeonsi: set COMPUTE_DISPATCH_INITIATOR.ORDER_MODE = 1
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-29 16:19:35 +02:00
Marek Olšák
af52e61935 radeonsi: use the DISPATCH packets to force COMPUTE_START_X/Y/Z = 0
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-29 16:19:35 +02:00
Rob Herring
a3d98ca62f Android: use symlinks for driver loading
Instead of having special driver loading logic for Android, create
symlinks to gallium_dri.so so we can use the standard loading logic.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-06-29 09:09:49 -05:00
Rob Herring
4aaa21d12e Android: i965: remove libdrm_intel dependency
Commit 7dd20bc3ee ("anv/i965: drop libdrm_intel dependency completely")
removed the libdrm_intel dependency for automake, but Android builds still
depended on it. Now the build requires a newer version of i915_drm.h and
fails on Android builds:

src/mesa/drivers/dri/i965/brw_performance_query.c:616:9: error: use of undeclared identifier 'I915_OA_FORMAT_A32u40_A4u32_B8_C8'
   case I915_OA_FORMAT_A32u40_A4u32_B8_C8:
        ^
src/mesa/drivers/dri/i965/brw_performance_query.c:1887:18: error: use of undeclared identifier 'I915_PARAM_SLICE_MASK'
      gp.param = I915_PARAM_SLICE_MASK;
                 ^
src/mesa/drivers/dri/i965/brw_performance_query.c:1893:18: error: use of undeclared identifier 'I915_PARAM_SUBSLICE_MASK'
      gp.param = I915_PARAM_SUBSLICE_MASK;
                 ^

Remove the libdrm_intel dependency for Android builds and add the necessary
include paths for the local copy of i915_drm.h.

Fixes: 7dd20bc ("anv/i965: drop libdrm_intel dependency completely")
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-29 12:33:27 +01:00
Mauro Rossi
b693fd8464 android: anv: drop libdrm_intel dependency
In addition to Rob Herring "Android: i965: remove libdrm_intel dependency",
we can drop libdrm_intel dependency in anv for Android.

Please check if libdrm has to stay as shared dependency and drop this comment line.

Fixes: 7dd20bc ("anv/i965: drop libdrm_intel dependency completely")
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-29 12:31:00 +01:00
Lucas Stach
4fb9f97047 etnaviv: fix memory leak when BO allocation fails
The resource struct is already allocated at this point and should be
freed properly.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-29 11:34:50 +02:00
Lucas Stach
b2a87ce34f etnaviv: fill in layer_stride for imported resources
The layer stride information is used in various parts of the driver,
so it needs to be present regardless if the driver allocated the
buffer itself or merely imported it from an external source.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-29 11:34:24 +02:00
Lionel Landwerlin
d8bf2861ad anv: use devinfo for number of thread/eu
It turns out Gen9LP has fewer threads per EU (6 vs 7).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2017-06-29 10:07:52 +01:00
Juan A. Suarez Romero
93b8dc4b94 intel: tools: add intel_aub.h as part of aubinator
Include intel_aub.h in the Makefile.tools.am

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-29 10:03:40 +02:00
Juan A. Suarez Romero
be5fe2153b intel: automake: include Makefile.drm.am
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-29 10:03:40 +02:00
Kenneth Graunke
40f842ab57 mesa: Require mipmap completeness for glCopyImageSubData() at times.
This patch makes glCopyImageSubData require mipmap completeness when the
texture object's built-in sampler object has a mipmapping MinFilter.
This is apparently the de facto behavior and mandated by Android's CTS.

One exception is that we ignore format based completeness rules
(specifically integer formats with linear filtering), as this is
also the de facto behavior that until recently was mandated by the
OpenGL 4.5 CTS.

This was discussed with both the OpenGL and OpenGL ES working groups,
and while everyone agrees this behavior is unfortunate and complicated,
it is what it is at this point.  There was little appetite for relaxing
restrictions given that all conformant Android drivers followed the
mipmapping rule, and all conformant GL 4.5 implementations ignored the
integer/linear rule.

Fixes (on i965):
dEQP-GLES31.functional.debug.negative_coverage.*.buffer.copy_image_sub_data

Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16224
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-06-28 22:29:41 -07:00
Timothy Arceri
6120fbc444 mesa: tidy up white space in pixelstore.c
2017-06-29 14:14:03 +10:00
Ian Romanick
e0acd62536 mesa: Refactor error checking for GL_TEXTURE_BASE_LEVEL vs texture targets
Add a big spec quotation justifying the error generated, which has
changed over the GL versions.

v2: Compact the spec quote based on a Khronos bug and discussion with Jason.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2017-06-28 16:38:05 -07:00
Kenneth Graunke
d2c4f714d1 i965: Drop index buffer re-alignment code.
This shouldn't ever happen - GL requires it to be aligned:

   "Clients must align data elements consistent with the requirements
    of the client platform, with an additional base-level requirement
    that an offset within a buffer to a datum comprising N basic
    machine units be a multiple of N."

Mesa should reject unaligned index buffers for us - we shouldn't have
to handle them in the driver.

Note that Gallium already makes this assumption.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-06-28 16:21:44 -07:00
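The alignment rule quoted in the i965 commit above (an offset to a datum of N basic machine units must be a multiple of N) reduces to a one-line check. The helper below is a hypothetical illustration, assuming index sizes of 1, 2, or 4 bytes.

```python
def index_offset_is_aligned(offset_bytes, index_size_bytes):
    """True when the offset is a multiple of the index size."""
    if index_size_bytes not in (1, 2, 4):
        raise ValueError("index size must be 1, 2, or 4 bytes")
    return offset_bytes % index_size_bytes == 0


aligned_16bit = index_offset_is_aligned(6, 2)
unaligned_32bit = index_offset_is_aligned(6, 4)
```

An offset of 6 bytes is acceptable for 16-bit indices but not for 32-bit ones, which is the kind of request the commit expects core Mesa to reject before it reaches the driver.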
Timothy Arceri
c1b1cad586 mesa: add KHR_no_error support for glBlendFunc*()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-29 08:54:11 +10:00
Timothy Arceri
f21a764092 mesa: create some glBlendFunc*() helper functions
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-29 08:54:11 +10:00
Timothy Arceri
87bc32166a mesa: add KHR_no_error support for glBindFragDataLocation*()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-29 08:54:11 +10:00
Timothy Arceri
aed0fc5efd mesa: add bind_frag_data_location() helper
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-29 08:54:11 +10:00
Timothy Arceri
cb209dae99 mesa: add KHR_no_error support for glGetUniformLocation()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-29 08:54:11 +10:00
Timothy Arceri
cc88eb97e0 mesa: inline _mesa_finish()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-29 08:54:11 +10:00
Timothy Arceri
d8143a4bde mesa: add KHR_no_error support for glDisableVertexA*A*()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-29 08:54:11 +10:00
Timothy Arceri
73e0140acc mesa: move error handling into disable_vertex_array_attrib() callers
This will let us just call disable_vertex_array_attrib() for
KHR_no_error support.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-29 08:54:11 +10:00
Timothy Arceri
d731b18933 mesa: add KHR_no_error support for glEnableVertexA*A*()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-29 08:54:11 +10:00
Timothy Arceri
8e77fceedb mesa: add KHR_no_error support for glLogicOp()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-29 08:54:10 +10:00
Timothy Arceri
ccbcb3ca17 mesa: add logic_op() helper
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-29 08:54:10 +10:00
Timothy Arceri
774580c8b9 mesa: add KHR_no_error support for glPixelStore*()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-29 08:54:10 +10:00
Timothy Arceri
9853ca6037 mesa: add pixel_storei() helper
Will be used to add KHR_no_error support.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-29 08:54:10 +10:00
Timothy Arceri
7d8937d23c mesa: remove redundant error check
We do the same check in the shared code in the set_tex_parameterf()
call.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-29 08:54:10 +10:00
Ian Romanick
2e3f40272e mesa: GL_TEXTURE_BORDER_COLOR exists in OpenGL 1.0, so don't depend on GL_ARB_texture_border_clamp
On NV20 (and probably also on earlier NV GPUs that lack
GL_ARB_texture_border_clamp) fixes the following piglit tests:

    gl-1.0-beginend-coverage gltexparameter[if]{v,}
    push-pop-texture-state
    texwrap 1d
    texwrap 1d proj
    texwrap 2d proj
    texwrap formats

All told, 49 more tests pass on NV20 (10de:0201).

No changes on Intel CI run or RV250 (1002:4c66).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-06-28 14:51:05 -07:00
Ian Romanick
36bd4a5f21 genxml: Silence about a billion unused parameter warnings
v2: Use textwrap.dedent to make the source line a lot shorter.
Shortening (?) the line was requested by Jason.

v3: Simplify the textwrap.dedent usage.  Suggested by Dylan.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-06-28 14:50:14 -07:00
Chad Versace
a56f0203c3 mesa: Fix Android build
The format_fallback.py script wants two arguments: 'csv-file' and
'out-file'.

Fixes: 20c99eaece "mesa: Add _mesa_format_fallback_rgbx_to_rgba() [v2]"
Reported-by: Rob Herring <robh@kernel.org>
2017-06-28 14:41:45 -07:00
Eero Tamminen
c35fd58688 i965: Fix anisotropic filtering for mag filter
Commit f8d69beed4, which moved sampler handling to genxml, broke the
change made by commit 6a7c5257ca.

This broke rendering in SynMark CSDof and TexFilterAniso tests.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101607

Thanks to Kevin, who spotted the actual typo!
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-28 13:33:28 -07:00
Lucas Stach
ec43605189 etnaviv: fix shader miscompilation with more than 16 labels
The labels array may change its virtual address on a reallocation, so
it is invalid to cache pointers into the array. Rather than using the
pointer directly, remember the array index.

Fixes miscompilation of shaders in glmark2 ideas, leading to GPU hangs.

Fixes: c9e8b49b (etnaviv: gallium driver for Vivante GPUs)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-06-28 22:04:30 +02:00
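Sketch for ec43605189: a minimal illustration of remembering an array index instead of a pointer into a growable array. The types and function names are hypothetical, and the initial capacity of 16 is an assumption chosen to mirror the "more than 16 labels" in the subject; this is not the etnaviv compiler's code.

```c
#include <stdlib.h>

struct label {
   int inst_idx;
};

struct label_array {
   struct label *items;
   size_t count;
   size_t capacity;
};

/* Append a label and return its index, or -1 on allocation failure.
 * realloc() may move the storage, so pointers obtained earlier from
 * label_get() must not be kept across this call. */
long
label_add(struct label_array *arr)
{
   if (arr->count == arr->capacity) {
      size_t new_cap = arr->capacity ? arr->capacity * 2 : 16;
      struct label *p = realloc(arr->items, new_cap * sizeof(*p));
      if (!p)
         return -1;
      arr->items = p;
      arr->capacity = new_cap;
   }
   arr->items[arr->count].inst_idx = -1;
   return (long)arr->count++;
}

/* Resolve an index to a pointer only at the point of use. */
struct label *
label_get(struct label_array *arr, long idx)
{
   return &arr->items[idx];
}
```

An index stays valid however often the array grows, whereas a cached pointer can dangle after the first reallocation.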
Dave Airlie
ff422500cc ac/nir: remove last remnants of v16i8
llvm doesn't need this workaround anymore.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-28 20:22:30 +01:00
Alex Smith
909184ac9c ac/nir: Use correct LLVM intrinsics for atomic ops on imageBuffers
The buffer intrinsics should be used instead of the image ones.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-06-28 21:05:04 +02:00
James Legg
69a17da037 ac/nir: assert printfs will fit
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-06-28 21:05:04 +02:00
James Legg
6fc41bb4d5 ac/nir: Make intrinsic_name buffer long enough
When using cmpswap on an image, it was being truncated to
llvm.amdgcn.image.atomic.cmpswa, with the coords type missing entirely.

v2: Add stable CC

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-06-28 21:05:04 +02:00
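Sketch for 69a17da037 and 6fc41bb4d5: a minimal illustration of detecting a name that does not fit its buffer. build_intrinsic_name() is a hypothetical helper, not the ac/nir code. It relies only on the standard behaviour of snprintf(), which returns the length the full string would have had.

```c
#include <stdbool.h>
#include <stdio.h>

/* Build "<base>.<suffix>" into buf.
 * Returns true only if the whole name fit, including the terminator. */
bool
build_intrinsic_name(char *buf, size_t size,
                     const char *base, const char *suffix)
{
   int n = snprintf(buf, size, "%s.%s", base, suffix);

   /* n is the untruncated length; n >= size means output was cut. */
   return n >= 0 && (size_t)n < size;
}
```

With a 32-byte buffer (an assumption, chosen because it reproduces the 31-character name quoted above) and a hypothetical coords suffix such as "v2i32", the function returns false. A caller can then assert on the result, or size the buffer so that it always succeeds.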
Chad Versace
2cde8ff545 i965/dri: Support R8G8B8A8 and R8G8B8X8 configs
The Android framework requires support for EGLConfigs with
HAL_PIXEL_FORMAT_RGBX_8888 and HAL_PIXEL_FORMAT_RGBA_8888.

Even though all RGBX formats are disabled on gen9 by
brw_surface_formats.c, the new configs work correctly on Broxton thanks
to _mesa_format_fallback_rgbx_to_rgba().

On GLX, this creates no new configs, and therefore breaks no existing
apps. See in-patch comments for explanation. I tested with glxinfo and
glxgears on Skylake.

On Wayland, this also creates no new configs, and therefore breaks no
existing apps. (I tested with mesa-demos' eglinfo and es2gears_wayland
on Skylake). The reason differs from GLX, though. In
dri2_wl_add_configs_for_visual(), the format table contains only
B8G8R8X8, B8G8R8A8, and B5G6R5; and dri2_add_config() correctly matches
EGLConfig to format by inspecting channel masks.

On Android, in Chrome OS, I tested this on a Broxton device. I confirmed
that the Google Play Store's EGLSurface used HAL_PIXEL_FORMAT_RGBA_8888,
and that an Asteroid game's EGLSurface used HAL_PIXEL_FORMAT_RGBX_8888.
Both apps worked well. (Disclaimer: I didn't test this patch on Android
with Mesa master. I backported this patch series to an older Android
branch).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-28 09:42:27 -07:00
Juan A. Suarez Romero
89d4008ac8 mesa: do not use format string as literal string
This fixes a couple of errors when building in Android:

external/mesa3d/src/mesa/main/shaderapi.c:293:49: error: format string
is not a string literal (potentially insecure)
[-Werror,-Wformat-security]
         _mesa_error(ctx, GL_INVALID_OPERATION, caller);
                                                ^~~~~~
external/mesa3d/src/mesa/main/shaderapi.c:293:49: note: treat the string
as an argument to avoid this
         _mesa_error(ctx, GL_INVALID_OPERATION, caller);
                                                ^
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-06-28 16:19:55 +02:00
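Sketch for 89d4008ac8: a minimal illustration of the fix the compiler note suggests, passing the string as an argument to "%s" instead of using it as the format. report_error() and report_caller() are hypothetical stand-ins for a printf-style error function and are not Mesa's API.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical printf-style reporter that formats into a buffer. */
int
report_error(char *out, size_t size, const char *fmt, ...)
{
   va_list ap;
   int n;

   va_start(ap, fmt);
   n = vsnprintf(out, size, fmt, ap);
   va_end(ap);
   return n;
}

/* Pass the caller name as data, never as the format string, so any
 * '%' it contains is copied literally instead of being interpreted. */
int
report_caller(char *out, size_t size, const char *caller)
{
   return report_error(out, size, "%s", caller);
}
```

Because the format is now a string literal, -Wformat-security has nothing to warn about.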
Brian Paul
7bbcf3ac70 scons: add code to generate format_fallback.c file
Fixes: a1983223d8 "mesa: Add _mesa_format_fallback_rgbx_to_rgba() [v2]"

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-28 04:12:25 -06:00
Samuel Pitoiset
e529ade0ea mesa: add KHR_no_error support for glClear()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
34e8d0e4ba mesa: add clear() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
48400e0bd6 mesa: add KHR_no_error support for glBindAttribLocation()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
34e5b39f37 mesa: add bind_attrib_location() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
352adb53db mesa: add KHR_no_error support for gl*ReadBuffer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
91fcba9914 mesa: create read_buffer_err() and always inline read_buffer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
89bc3ed7a3 mesa: add KHR_no_error support for glVertex*AttribBinding()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
401fa69132 mesa: add KHR_no_error support for glShaderStorageBlockBinding()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
edd5082861 mesa: add shader_storage_block_binding() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
6a2c1e76f2 mesa: add KHR_no_error support for glUniformBlockBinding()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
277135c1ed mesa: add uniform_block_binding() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
f543107256 mesa: add KHR_no_error support for glFenceSync()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
dd71fd1dd3 mesa: add fence_sync() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
6e0cd29132 mesa: add KHR_no_error support for glClientWaitSync()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
20ff1f9db7 mesa: add client_wait_sync() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
78d3510f0c mesa: add KHR_no_error support for glCheckFramebufferStatus()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
b87a2cbec4 mesa: add KHR_no_error support for gl*Renderbuffers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
beb74c9b87 mesa: prepare create_render_buffers() for KHR_no_error support
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
836b48a836 mesa: add KHR_no_error support for gl*ProgramPipelines()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
89510d26a9 mesa: prepare create_program_pipelines() for KHR_no_error support
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:13 +02:00
Samuel Pitoiset
18e31bb252 mesa: add KHR_no_error support for gl*Samplers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
455b1a3a4b mesa: prepare create_samplers() helper for KHR_no_error support
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
56f428817f mesa: add KHR_no_error support for gl*Textures()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
ab6d383e32 mesa: prepare create_textures() helper for KHR_no_error support
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
821b806d23 mesa: fix an error message in create_textures()
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
7c02267673 mesa: add KHR_no_error support for gl*Buffers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
064bb7499c mesa: prepare create_buffers() helper for KHR_no_error support
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
50f9f510c9 mesa: add KHR_no_error support for glBindTextureUnit()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
81968cb748 mesa: add bind_texture_unit() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
3561d93668 mesa: add KHR_no_error support for glDepthRangeIndexed()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
9628282f1e mesa: add KHR_no_error support for glDepthFunc()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
bbc03839d1 mesa: add depth_func() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
eaa477104c mesa: add KHR_no_error support for glFrontFace()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
d77ad9da63 mesa: add front_face() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
d700ade81a mesa: add KHR_no_error support for glCullFace()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
ac92b75002 mesa: add cull_face() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
03dc92ad97 mesa: add KHR_no_error support for glCreateShader() and glCreateShaderObjectARB()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
c1782e44d0 mesa: rename create_shader() to create_shader_err()
And add a no_error variant.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
868c9c244d mesa: pass the 'caller' function to create_shader()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
8863996940 mesa: add KHR_no_error support for glAttachShader() and glAttachObjectARB()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
f04a5b4df0 mesa: rename attach_shader() to attach_shader_err()
And add a no_error variant.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Samuel Pitoiset
3ae7777d25 mesa: pass the 'caller' function to attach_shader()
In order to fix GL error messages.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-28 10:25:12 +02:00
Ben Crocker
c26be2b2e9 mapi: Enable assembly language API acceleration for PPC64LE (V2)
Implement assembly language API acceleration for PPC64LE,
analogous to long-standing implementations for X86 and X86-64.

See also similar implementation in libglvnd.

Tested with Piglit.

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bill Schmidt <wschmidt@linux.vnet.ibm.com>
2017-06-28 08:20:45 +01:00
Chad Versace
74db56b97a i965: Add a RGBX->RGBA fallback for glEGLImageTextureTarget2D()
This enables support for importing RGBX8888 EGLImage textures on
Skylake.

Chrome OS needs support for RGBX8888 EGLImage textures because
the Android framework produces HAL_PIXEL_FORMAT_RGBX_8888 winsys
surfaces, which the Chrome OS compositor consumes as dma_bufs.  On
hardware for which RGBX is unsupported or disabled, normally core Mesa
provides the RGBX->RGBA fallback during glTexStorage.  But the DRIimage
code bypasses core Mesa, so we must do the fallback in i965.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 16:56:30 -07:00
Chad Versace
a1983223d8 mesa: Add _mesa_format_fallback_rgbx_to_rgba() [v2]
The new function takes a mesa_format and, if the format is a non-alpha
(RGBX) format with an alpha (RGBA) variant, returns the alpha format.
Otherwise, it returns the original format.

Example:
  input -> output

  // Fallback exists
  MESA_FORMAT_R8G8B8X8_UNORM -> MESA_FORMAT_R8G8B8A8_UNORM
  MESA_FORMAT_RGBX_UNORM16 -> MESA_FORMAT_RGBA_UNORM16

  // No fallback
  MESA_FORMAT_R8G8B8A8_UNORM -> MESA_FORMAT_R8G8B8A8_UNORM
  MESA_FORMAT_Z_FLOAT32 -> MESA_FORMAT_Z_FLOAT32

i965 will use this for EGLImages and DRIimages.

v2 (Jason Ekstrand):
 - Use mako
 - Rework to be easier to read
 - Write directly to the output file

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 16:56:28 -07:00
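Sketch for a1983223d8: a minimal illustration of the mapping shown in the example table. The enum and function below are hypothetical and cover only the four formats listed there; the real function is generated from a format table and handles many more.

```c
/* Hypothetical subset of formats, named after the examples above. */
enum fmt {
   FMT_R8G8B8X8_UNORM,
   FMT_R8G8B8A8_UNORM,
   FMT_RGBX_UNORM16,
   FMT_RGBA_UNORM16,
   FMT_Z_FLOAT32
};

/* Return the RGBA counterpart of an RGBX format, or the input
 * unchanged when no such counterpart exists. */
enum fmt
fallback_rgbx_to_rgba(enum fmt format)
{
   switch (format) {
   case FMT_R8G8B8X8_UNORM:
      return FMT_R8G8B8A8_UNORM;
   case FMT_RGBX_UNORM16:
      return FMT_RGBA_UNORM16;
   default:
      return format;
   }
}
```

Returning the input in the default case lets callers apply the fallback unconditionally.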
Marek Olšák
4a10d6154e radeonsi: move instance divisors into a constant buffer
Shader key size: 107 -> 47

Divisors of 0 and 1 are encoded in the shader key. Greater instance divisors
are loaded from a constant buffer.

The shader code doing the division is huge. Is it something we need to
worry about? Does any app use instance divisors >= 2?

VS prolog disassembly:
    s_load_dwordx4 s[12:15], s[0:1], 0x80  ; C00A0300 00000080
    s_nop 0                                ; BF800000
    s_waitcnt lgkmcnt(0)                   ; BF8C007F
    s_buffer_load_dword s14, s[12:15], 0x4 ; C0220386 00000004
    s_waitcnt lgkmcnt(0)                   ; BF8C007F
    v_cvt_f32_u32_e32 v4, s14              ; 7E080C0E
    v_rcp_iflag_f32_e32 v4, v4             ; 7E084704
    v_mul_f32_e32 v4, 0x4f800000, v4       ; 0A0808FF 4F800000
    v_cvt_u32_f32_e32 v4, v4               ; 7E080F04
    v_mul_hi_u32 v5, v4, s14               ; D2860005 00001D04
    v_mul_lo_i32 v6, v4, s14               ; D2850006 00001D04
    v_cmp_eq_u32_e64 s[12:13], 0, v5       ; D0CA000C 00020A80
    v_sub_i32_e32 v5, vcc, 0, v6           ; 340A0C80
    v_cndmask_b32_e64 v5, v6, v5, s[12:13] ; D1000005 00320B06
    v_mul_hi_u32 v5, v5, v4                ; D2860005 00020905
    v_add_i32_e32 v6, vcc, v5, v4          ; 320C0905
    v_subrev_i32_e32 v4, vcc, v5, v4       ; 36080905
    v_cndmask_b32_e64 v4, v4, v6, s[12:13] ; D1000004 00320D04
    v_mul_hi_u32 v5, v4, v1                ; D2860005 00020304
    v_add_i32_e32 v4, vcc, s8, v0          ; 32080008
    v_mul_lo_i32 v6, v5, s14               ; D2850006 00001D05
    v_add_i32_e32 v7, vcc, 1, v5           ; 320E0A81
    v_cmp_ge_u32_e64 s[12:13], v1, v6      ; D0CE000C 00020D01
    v_sub_i32_e32 v6, vcc, v1, v6          ; 340C0D01
    v_cmp_le_u32_e32 vcc, s14, v6          ; 7D960C0E
    v_cndmask_b32_e64 v8, 0, -1, s[12:13]  ; D1000008 00318280
    v_cndmask_b32_e64 v6, 0, -1, vcc       ; D1000006 01A98280
    v_and_b32_e32 v6, v8, v6               ; 260C0D08
    v_cmp_eq_u32_e32 vcc, 0, v6            ; 7D940C80
    v_cndmask_b32_e32 v6, v7, v5, vcc      ; 000C0B07
    v_add_i32_e32 v5, vcc, -1, v5          ; 320A0AC1
    v_cmp_eq_u32_e32 vcc, 0, v8            ; 7D941080
    v_cndmask_b32_e32 v5, v6, v5, vcc      ; 000A0B06
    v_add_i32_e32 v5, vcc, s9, v5          ; 320A0A09

v2: set prefer_mono for fetched instance divisors

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-27 19:55:09 +02:00
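Sketch for 4a10d6154e: a plain C rendering of the usual instanced-attribute rule that the prolog appears to implement with its long unsigned-division sequence. This is an assumption about intent read from the disassembly, not the driver's code, and the function and parameter names are hypothetical.

```c
#include <stdint.h>

/* Compute the buffer element fetched for one vertex attribute.
 * divisor == 0: per-vertex attribute.
 * divisor == 1: one element per instance, no division needed.
 * divisor >= 2: requires an unsigned integer division. */
uint32_t
attrib_fetch_index(uint32_t vertex_id, uint32_t base_vertex,
                   uint32_t instance_id, uint32_t start_instance,
                   uint32_t divisor)
{
   if (divisor == 0)
      return base_vertex + vertex_id;
   if (divisor == 1)
      return start_instance + instance_id;
   return start_instance + instance_id / divisor;
}
```

Only the last branch needs the divisor as a runtime value, which fits the description above: 0 and 1 can be distinguished by the shader key, while larger divisors are loaded from memory.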
Marek Olšák
aef998fe4b radeonsi: check nr_cbufs in other places before flushing CB
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-27 18:46:12 +02:00
Marek Olšák
f9a7e7fe14 radeonsi: use #pragma pack to pack si_shader_key
sizeof(struct si_shader_key):
  Before reverting the 2 commits: 120 bytes
  After reverting the 2 commits: 128 bytes
  With #pragma pack: 107 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-27 18:45:07 +02:00
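Sketch for f9a7e7fe14: a minimal illustration of how #pragma pack removes padding from a struct. The two structs are hypothetical and unrelated to the real si_shader_key layout; #pragma pack(push, 1) is supported by GCC, Clang and MSVC.

```c
#include <stdint.h>

/* Default layout: the compiler pads around the 64-bit member so that
 * it is naturally aligned. */
struct key_padded {
   uint8_t a;
   uint64_t b;
   uint8_t c;
};

/* Packed layout: members are placed back to back, 1 + 8 + 1 bytes. */
#pragma pack(push, 1)
struct key_packed {
   uint8_t a;
   uint64_t b;
   uint8_t c;
};
#pragma pack(pop)
```

Here sizeof(struct key_packed) is 10, while the padded version is typically 24 on 64-bit targets. The trade-off is that packed members may be unaligned, which can make access slower on some architectures.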
Marek Olšák
77d2a98353 Revert "radeonsi: use uint32_t to declare si_shader_key.opt.kill_outputs"
This reverts commit 7b2240ac9c.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-27 18:45:07 +02:00
Marek Olšák
dbe45e1180 Revert "radeonsi: remove 8 bytes from si_shader_key with uint32_t ff_tcs_inputs_to_copy"
This reverts commit 6b6fed3a3c.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-27 18:45:07 +02:00
Marek Olšák
984f7feeb4 mesa: optimize GL_PRIMITIVE_RESTART_NV more
And other client state changes don't have to call
update_derived_primitive_restart_state.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-27 18:45:07 +02:00
Marek Olšák
bcf5d5ce40 mesa: fix clip plane enable breakage
Broken by:

commit 00173d91b7
Author: Marek Olšák <marek.olsak@amd.com>
Date:   Sat Jun 10 12:09:43 2017 +0200

    mesa: don't flag _NEW_TRANSFORM for st/mesa if possible

It also optimizes the case slightly for GL core.

It doesn't try to fix that glEnable might be a bad place to do the
clip plane transformation.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-06-27 18:45:07 +02:00
Leo Liu
fad0b47219 radeon/vcn: enable h264 decode extension support
It's enabled through message buffer for UVD

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-06-27 10:59:44 -04:00
Charmaine Lee
b2e78e79d7 svga: clean up format_cap_table
Per Jose's suggestion, this patch cleans up format_cap_table to remove
the unnecessary default cap value for vgpu10 formats since those devcap values
can be retrieved from the device.

Tested with MTT conform, glretrace, piglit in HWv13 and HWv8.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-06-27 07:49:03 -06:00
Charmaine Lee
122ca27a48 svga: fix the default devcap for SVGA3D_Z_D24S8_INT
The default devcap for format SVGA3D_Z_D24S8_INT in HWv8 when its devcap is
not explicitly advertised should be set to zero to match the default value
in the device.

Tested with MTT piglit in HW version 8.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-06-27 07:49:02 -06:00
Charmaine Lee
eea6223184 svga: create buffer surfaces for incompatible bind flags
In cases where certain bind flags cannot be enabled together
(for example, CONSTANT_BUFFER cannot be combined with any other flag),
a separate host surface will be created.
For example, if a stream output buffer is reused as a constant buffer,
two host surfaces will be created, one for stream output,
and another one for constant buffer. Data will be copied from the
stream output surface to the constant buffer surface.

Fixes piglit test ext_transform_feedback-immediate-reuse-index-buffer,
                  ext_transform_feedback-immediate-reuse-uniform-buffer

Tested with MTT piglit, MTT glretrace, Nature, NobelClinician Viewer, Tropics.

v2: Fix bind flags compatibility check as suggested by Brian.
v3: Use the list utility to maintain the buffer surface list.
v4: Use the SAFE rev of LIST_FOR_EACH_ENTRY

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-06-27 07:49:02 -06:00
Charmaine Lee
7abfb0b0d5 svga: do not unconditionally enable streamout bind flag
Currently we unconditionally enable streamout bind flag at
buffer resource creation time. This is not necessary if the buffer
is never used as a streamout buffer. With this patch, we enable
streamout bind flag as indicated by the state tracker. If the buffer
is later bound to streamout and does not already have streamout bind
flag enabled, we will recreate the buffer with
the new set of bind flags. Buffer content will be copied
from the old buffer to the new one.

Tested with MTT piglit, Nature, Tropics, Lightsmark.

v2: Fix bind flags check as suggested by Brian.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-06-27 07:49:02 -06:00
Charmaine Lee
b549f5e6b1 svga: pass tobind_flags to svga_buffer_handle
This is to prepare for more bind_flags optimization
in subsequent patches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-06-27 07:49:02 -06:00
Charmaine Lee
4a79b508a4 svga: pass bind_flags to surface create functions
This is to prepare for other bind_flags optimization
in subsequent patches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-06-27 07:49:02 -06:00
Brian Paul
ce608784d0 pipe_loader_sw: fix compilation warning
Add the new 'flags' parameter to pipe_loader_sw_create_screen().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-27 07:49:02 -06:00
Eric Engestrom
b3eda74acf mesa: add missing include
src/mesa/drivers/x11/xm_dd.c:688:7: warning: implicit declaration of function ‘_mesa_update_draw_buffer_bounds’; did you mean ‘_mesa_has_ARB_draw_buffers_blend’? [-Wimplicit-function-declaration]
       _mesa_update_draw_buffer_bounds(ctx, ctx->DrawBuffer);
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Cc: Marek Olšák <marek.olsak@amd.com>
Fixes: 585c5cf8a5 ("mesa: don't update draw buffer bounds in _mesa_update_state")
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-27 14:33:49 +01:00
Lionel Landwerlin
3e0d54d270 i965: perf: add support for Geminilake
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 14:10:30 +03:00
Lionel Landwerlin
9a50fc7cfc i965: perf: add support for Kabylake
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 14:10:30 +03:00
Lionel Landwerlin
8ff086fa68 i965: perf: use gen_device_info rather than brw_context
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 14:10:30 +03:00
Robert Bragg
e277ff41c0 i965: perf: ensure isolated timer reports while idle don't confuse filtering
From experimentation in IGT, we found that the OA unit might label
some report as "idle" (using an invalid context ID), right after a
report for a given context. Deltas generated by those reports actually
belong to the previous context, even though they're not labelled as
such.

This change ensures that while reading OA reports, we only
consider the GPU actually idle after 2 reports with an invalid context
ID.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 14:10:29 +03:00
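Sketch for e277ff41c0: a minimal illustration of treating the GPU as idle only after two consecutive reports with an invalid context ID. The struct, the function and the INVALID_CTX_ID marker are hypothetical; the actual report format and the driver's bookkeeping are not shown in this log.

```c
#include <stdbool.h>
#include <stdint.h>

#define INVALID_CTX_ID UINT32_MAX   /* hypothetical "idle" marker */

struct idle_filter {
   unsigned invalid_run;   /* consecutive reports with an invalid ID */
};

/* Feed one report's context ID. Returns true only once two invalid
 * IDs have been seen in a row, so a single isolated "idle" report
 * right after a context's report is still attributed to that context. */
bool
idle_filter_update(struct idle_filter *f, uint32_t ctx_id)
{
   if (ctx_id == INVALID_CTX_ID)
      f->invalid_run++;
   else
      f->invalid_run = 0;

   return f->invalid_run >= 2;
}
```

A caller would keep accumulating deltas for the previous context while this returns false.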
Lionel Landwerlin
31b11f69f7 i965: perf: keep on reading reports until delimiting timestamp
Due to an underlying hardware race condition, we have no guarantee
that all the reports coming from the OA buffer related to the workload
we're trying to measure have landed in memory by the time all the work
submitted has completed. That means we need to keep on reading the OA
stream until we read a report with a timestamp more recent than the
timestamp recorded by the MI_REPORT_PERF_COUNT at the end of the
performance query.

v2: fix uninitialized offset variable to 0 (Lionel)

v3: rework the reading to avoid blocking the user of the API unless
    requested (Rob)

v4: fix a bug that makes the i965 driver read the perf stream when
    not necessary, leading to very long counter accumulation times
    (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 14:10:29 +03:00
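Sketch for 31b11f69f7: a minimal illustration of consuming reports until one carries a timestamp newer than the end-of-query timestamp. The function, the flat timestamp array and the lack of wraparound handling are simplifying assumptions for illustration, not the i965 implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Scan the reports available so far. Returns how many of them belong
 * to the query. *done is set to true once a report newer than end_ts
 * has been seen; if it stays false, more reports may still land and
 * the caller should read again later. */
size_t
consume_reports(const uint32_t *timestamps, size_t n,
                uint32_t end_ts, bool *done)
{
   size_t used = 0;

   *done = false;
   for (size_t i = 0; i < n; i++) {
      if (timestamps[i] > end_ts) {
         *done = true;
         break;
      }
      used++;
   }
   return used;
}
```

Returning a "not done yet" state instead of looping lets the caller decide whether to block, in line with the v3 note above.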
Robert Bragg
1fc7b95127 i965: Add Gen8+ INTEL_performance_query support
Enables access to OA unit metrics on Gen8+ via INTEL_performance_query.

v2: make use of new parameters coming from gen_device_info (Lionel)

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 14:10:29 +03:00
Robert Bragg
243909d41e i965: Add XML OA metric sets for Gen8+
Also updates Makefile.am to generate corresponding normalization code.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 14:10:29 +03:00
Robert Bragg
e74972a3a6 i965: Add Gen8+ sys_vars for generated OA code
In preparation for adding XML OA metric set descriptions for Gen 8 and 9
which will result in auto generated code that depends on a number of new
system variables ($EuSubslicesTotalCount, $EuThreadsCount and
$SliceMask) this adds corresponding members to brw->perf.sys_vars.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 14:10:29 +03:00
Lionel Landwerlin
7dd20bc3ee anv/i965: drop libdrm_intel dependency completely
With Ken's work to drop the library dependency on libdrm_intel, we now
only depend on libdrm for the kernel uapi headers it provides. It
seems like we're better off just embedding those headers ourselves,
making the lives of people developing new features tightly
integrated with the kernel a tiny bit easier.

This change also makes it a bit more obvious what cflags/libs are
required by the i915 drivers vs i965, by renaming INTEL_CFLAGS/LIBS
into I915_CFLAGS/LIBS.

Headers were generated from drm-tip on the following commit :

   commit 6d61e70ccc21606ffb8a0a03bd3aba24f659502b
   Merge: 338ffbf7cb5e c0bc126f97fb
   Author: Dave Airlie <airlied@redhat.com>
   Date:   Tue Jun 27 07:24:49 2017 +1000

       Backmerge tag 'v4.12-rc7' into drm-next

v2: Use installed files from the kernel (Daniel Vetter)

v3: Use headers from drm-next rather than drm-tip (Dave/Daniel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 14:10:29 +03:00
Lionel Landwerlin
3c50ebce25 i915: use different CFLAGS/LIBS variables than i965/anv
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-27 14:10:29 +03:00
Lionel Landwerlin
230691b8e5 aubinator: import intel_aub.h from libdrm
This enables us to compile aubinator without the libdrm dependency.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 14:10:28 +03:00
Lionel Landwerlin
adafe4b733 i965: perf: minimize the chances to spread queries across batchbuffers
Counters related to timings will be sensitive to any delay introduced
by the software. In particular, if the begin and end of a performance
query end up in different batches, time-related counters will
exhibit bigger values caused by the time it takes for the kernel
driver to load new requests into the hardware.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-27 14:10:25 +03:00
Juan A. Suarez Romero
7ee409dd4e nir: implement GLSL.std.450 NMax, NMin and NClamp operations
v2: NIR fmax/fmin already handles NaN (Connor).

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2017-06-27 12:01:11 +02:00
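Sketch for 7ee409dd4e: a minimal C illustration of the NaN rule behind NMin, NMax and NClamp, where a NaN operand yields the other operand. The functions are hypothetical scalar stand-ins written out explicitly; they are not the NIR lowering, which per the v2 note relies on fmin/fmax already handling NaN.

```c
/* NaN is the only value that compares unequal to itself. */
static int
is_nan(double x)
{
   return x != x;
}

/* Minimum that returns the other operand when one is NaN. */
double
nmin(double a, double b)
{
   if (is_nan(a))
      return b;
   if (is_nan(b))
      return a;
   return a < b ? a : b;
}

/* Maximum that returns the other operand when one is NaN. */
double
nmax(double a, double b)
{
   if (is_nan(a))
      return b;
   if (is_nan(b))
      return a;
   return a > b ? a : b;
}

/* Clamp built from the two NaN-aware operations above. */
double
nclamp(double x, double lo, double hi)
{
   return nmin(nmax(x, lo), hi);
}
```

With these definitions, clamping a NaN input to [lo, hi] yields lo, because the inner nmax() discards the NaN first.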
Juan A. Suarez Romero
b5ae17fe59 nir: add support for 64-bit in SmoothStep function
According to GLSL.std.450 spec, SmoothStep expects input to be a
floating-point type, but it does not restrict the bitsize.

The current implementation relies on inputs being 32-bit.

This commit extends the support to 64-bit size inputs.
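
For reference, the SmoothStep formula applied to 64-bit operands can be sketched in plain C as below. The function name smoothstep64 is hypothetical; the actual change lives in the SPIR-V to NIR translation and is not reproduced here.

```c
/* Hermite interpolation between 0 and 1 for edge0 < x < edge1,
 * evaluated entirely in double precision. */
double smoothstep64(double edge0, double edge1, double x)
{
   double t = (x - edge0) / (edge1 - edge0);

   /* Clamp t to [0, 1]. */
   if (t < 0.0) t = 0.0;
   if (t > 1.0) t = 1.0;

   return t * t * (3.0 - 2.0 * t);
}
```

The only point of the sketch is that every constant and intermediate has the operand's bit size, rather than being fixed to 32-bit floats.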

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2017-06-27 12:01:11 +02:00
Juan A. Suarez Romero
4195a9450b nir: sge operation is defined for floating-point types
According to the GLSL.std.450 spec, the operand of the step() function
must be a floating-point type. It does not restrict the value to 32-bit
floats.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2017-06-27 12:01:11 +02:00
Topi Pohjolainen
b3bf453686 i965: Separate gen < 8 and gen >= 8 paths explicitly in wrap_mode()
Makes coverity happier.

Fix indentation in gen >= 8 block while at it.

CID: 1413020
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-27 10:20:35 +03:00
Topi Pohjolainen
fbcc9555c5 intel/anv: Add missing break in anv_CreateDevice()
CID: 1413018
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-27 10:19:55 +03:00
Nicolai Hähnle
2ce126df3a ac/nir: convert emit helpers to ac_llvm_context
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-27 10:28:30 +10:00
Nicolai Hähnle
58d496c8e2 ac/nir: remove unused nir_to_llvm_context::has_ddxy
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-27 10:28:30 +10:00
Nicolai Hähnle
6ecef25545 ac/nir: implement nir_op_f2b
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-27 10:28:30 +10:00
Nicolai Hähnle
dacf73e527 ac/nir: implement nir_op_{b2i,i2b}
Booleans in NIR are ~0 for true, b2i returns 0/1.
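
A minimal C sketch of the two conversions, assuming 32-bit values and the ~0 encoding for true described above. The function names are hypothetical stand-ins for the opcodes, not the LLVM-emitting code in ac/nir.

```c
#include <stdint.h>

/* Boolean (0 or ~0) to integer (0 or 1). */
uint32_t b2i(uint32_t b)
{
   return b & 1u;
}

/* Integer to boolean: any non-zero value becomes ~0, zero stays 0. */
uint32_t i2b(uint32_t i)
{
   return i != 0 ? ~(uint32_t)0 : 0;
}
```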

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-27 10:28:30 +10:00
Nicolai Hähnle
77d7764d5e ac/nir: convert type helpers to ac_llvm_context
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-27 10:28:30 +10:00
Nicolai Hähnle
b7bd49158e ac/llvm: fix type of second llvm.cttz.* parameter
LLVM has required an i1 here for a long time. llvm.ctlz.* was fixed in
commit edd23e0606 ("ac/llvm: fix various findMSB bugs").

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-27 10:28:30 +10:00
Nicolai Hähnle
e8ba03d32a ac/shader_info: fix a comment
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-27 10:28:29 +10:00
Nicolai Hähnle
edfd3be77e ac: add ac_llvm_context::v8i32
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-27 10:28:29 +10:00
Nicolai Hähnle
331a574732 ac: add ac_llvm_context::{i,f}32_{0,1}
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-27 10:28:29 +10:00
Nicolai Hähnle
7bf8c944dc ac: add ac_llvm_context::{i16, i64, f16, f64}
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-27 10:28:29 +10:00
Ilia Mirkin
4a79f2be33 nv50/ir: fix combineLd/St to update existing records as necessary
Previously the logic would decide that the record is kept, which
translates into keep = false in the caller, which meant that these
passes did not run.

While it's right that keep = false, which means that a new record does
not need to be added, we do still have to perform the usual list
maintenance. It's easiest to do this pre-merge rather than post.

The lowering that clip/cull distance passes produce triggers this bug in
TCS (since reading outputs is done differently in other stages), but it
should be possible to achieve it with the right sequence of regular
reads/writes.

Fixes: KHR-GL45.cull_distance.functional
Fixes: generated_tests/spec/arb_tessellation_shader/execution/tes-input/tes-input-gl_ClipDistance.shader_test
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-06-26 20:24:19 -04:00
Ilia Mirkin
7d56ae5eb2 nv50/ir: adjust overlapping logic to take fileIndex-relative offsets
If the fileIndex is different, that means they are in logically
different spaces. However if there's also a relative offset, then they
could end up pointing at the same spot again.

Also add a note about potential for multiple buffers to overlap even if
they're at different file indexes. However that's potentially lowered
away by the point that this logic hits.

Not known to fix any specific application or test.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-26 20:24:19 -04:00
Ilia Mirkin
55a8c11705 nv50/ir: VFETCH is also considered a load for MemoryOpt
This has no effect since in practice this will only play for
memory-backed files, for which VFETCH will never happen.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-26 20:24:19 -04:00
Ilia Mirkin
c12f8305a8 nv50,nvc0: remove IDX from bufctx immediately, to avoid conflicts with clear
The idxbuf could linger, and when a clear happened, which also uses the
3d bufctx, we could get an error trying to access it.

This fixes spurious crashes/errors in CTS tests.

Fixes: 61d8f3387d ("nv50,nvc0: clear index buffer bufctx bin unconditionally")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-06-26 20:23:04 -04:00
Ilia Mirkin
8c02ee4a8b nv50/ir: fetch indirect sources BEFORE the op that uses them
All the BuildUtil helpers just insert the operation into the current BB.
So we have to take care that any fetchSrc() operations happen before the
operation whose setIndirect() it goes into.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-06-26 20:22:46 -04:00
Timothy Arceri
9545139ce5 mesa: skip FLUSH_VERTICES() if no samplers were changed
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-27 09:20:29 +10:00
Timothy Arceri
191ff86d53 mesa: don't set _NEW_PROGRAM_CONSTANTS for non-bindless opaque uniforms
v2: rebase on new _mesa_flush_vertices_for_uniforms() helper

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-27 09:17:16 +10:00
Rob Herring
c4291a3283 Android: add renderonly files to libmesa_gallium
vc4 now depends on renderonly functions, but these weren't added to the
Android build resulting in the following errors:

src/gallium/drivers/vc4/vc4_resource.c:380: error: undefined reference to 'renderonly_scanout_destroy'
src/gallium/drivers/vc4/vc4_resource.c:681: error: undefined reference to 'renderonly_create_gpu_import_for_resource'
src/gallium/drivers/vc4/vc4_screen.c:625: error: undefined reference to 'renderonly_dup'
src/gallium/winsys/pl111/drm/pl111_drm_winsys.c:37: error: undefined reference to 'renderonly_create_gpu_import_for_resource'
src/gallium/winsys/pl111/drm/pl111_drm_winsys.c:37: error: undefined reference to 'renderonly_create_gpu_import_for_resource'

Fixes: 7029ec05e2 ("gallium: Add renderonly-based support for pl111+vc4.")
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-06-26 16:10:42 -07:00
Timothy Arceri
a00a277da9 mesa: add KHR_no_error support for glCopyTexImage*D()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-27 08:27:11 +10:00
Timothy Arceri
8bf02efed3 mesa: add no error support to copyteximage()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-27 08:27:11 +10:00
Timothy Arceri
167f6a33fa mesa: create copyteximage_err() helper and always inline copyteximage()
This will be useful in the following patch when we add KHR_no_error
support.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-27 08:27:11 +10:00
Timothy Arceri
8b9eccc061 mesa: tidy up copyteximage()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-27 08:27:11 +10:00
Ian Romanick
f73c63a175 i915: On Gen <= 3 there are no array textures
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-06-26 15:20:09 -07:00
Ian Romanick
122e6dc451 i915: On Gen <= 3 there is no W-tiling
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-06-26 15:20:09 -07:00
Ian Romanick
97d332ce0e i915: Remove unused fields intel_mipmap_tree::logical_(width|height|depth)0
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-06-26 15:20:09 -07:00
Ian Romanick
ca8e8d5520 i915: Remove unused field intel_mipmap_tree::array_spacing_lod0
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-06-26 15:20:09 -07:00
Ian Romanick
e5a632a256 i915: On Gen <= 3 there is no multisampling
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-06-26 15:20:09 -07:00
Ian Romanick
7b7a0ba04c i915: Trivial code reformatting
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-06-26 15:20:09 -07:00
Ian Romanick
b08981009d i915,i965: Don't condition use of GLSL clear on the current API
Meta always sets the API to API_OPENGL_COMPAT, so the current API
setting is irrelevant.

   text	   data	    bss	    dec	    hex	filename
7154994	 256860	  37332	7449186	 71aa62	32-bit i965_dri.so before
7154978	 256860	  37332	7449170	 71aa52	32-bit i965_dri.so after
6788451	 328056	  50704	7167211	 6d5ceb	64-bit i965_dri.so before
6788419	 328056	  50704	7167179	 6d5ccb	64-bit i965_dri.so after

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-06-26 15:20:09 -07:00
Timothy Arceri
7719f52d5f mesa: add KHR_no_error support for glCopyTex{ture}SubImage*D()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-27 08:15:09 +10:00
Timothy Arceri
b480211058 mesa: add copy_texture_sub_image_no_error() helper
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-27 08:15:09 +10:00
Timothy Arceri
3034c4c725 mesa: remove redundant NULL check
This can never be NULL in any of the entry paths.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-27 08:15:09 +10:00
Timothy Arceri
c7f7a375d9 mesa: create copy_texture_sub_image_err() helper
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-27 08:15:09 +10:00
Timothy Arceri
45498aff82 mesa: make _mesa_copy_texture_sub_image() static
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-27 08:15:09 +10:00
Timothy Arceri
bc0af44a5a mesa: add KHR_no_error support for gl{Compressed}TexImage*D()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-27 08:11:02 +10:00
Timothy Arceri
51f4ebdbdc mesa: add no error support to teximage()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-27 08:11:02 +10:00
Timothy Arceri
ca5f1e82de mesa: create wrapper around teximage()
This is used to inline KHR_no_error logic without inlining
the function into all its callers.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-27 08:11:02 +10:00
Timothy Arceri
62abf6862f mesa: fix unused variable warning in release builds
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-27 08:03:29 +10:00
Marek Olšák
ccf963ed29 radeonsi: don't flush and wait for CB after depth-only rendering
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-26 23:35:19 +02:00
Ian Romanick
1b101ca809 blorp: Use normalized coordinates on Gen6
Apparently, the sampler has some sort of precision issues for
non-normalized texture coordinates with linear filtering.  This caused
some small precision issues in scaled blits.  Work around this by using
normalized coordinates.  There is some extra work necessary because Gen6
uses TEX (instead of TXF) for some multisample resolve blits.

Fixes piglit.spec.arb_framebuffer_object.fbo-blit-stretch on SNB.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68365
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-06-26 13:41:11 -07:00
Marek Olšák
25ea7aa5cd mesa/glthread: don't include pthread.h
Not needed. This fixes the Windows build.
2017-06-26 22:23:31 +02:00
Nanley Chery
d6748f1fc4 anv/gpu_memcpy: Rename the gpu_memcpy function
A GPU memcpy function could alternatively be implemented using MI_*
commands. Provide more detail into how this one operates in case another
memcpy function is created.

v2:
- Update the commit message.
v3:
- Use 'memcpy' instead of 'cpy' (Jason Ekstrand)
- Shorten 'streamout' to 'so'

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery
1415e7a997 anv/blorp: Provide surface states for CCS resolves
In the future, we plan on using this method to resolve images whose
surface state fast-clear value is dynamically updated during command
buffer execution. Start using it now for testing and to reduce churn
later on.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery
4b2a2b70e0 anv/blorp: Add a surface-state-based CCS resolve function
This will be used in the next patch.

v2:
- Omit BLORP_BATCH_NO_EMIT_DEPTH_STENCIL (Jason Ekstrand)
- Update commit message.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery
d1119ab7b6 blorp/clear: Add a binding-table-based CCS resolve function
v2:
- Do layered resolves.
(Jason Ekstrand):
- Replace "bt" suffix with "attachment".
- Rename helper function to prepare_ccs_resolve.
- Move blorp_params_init() into helper function.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery
6235f08ff8 anv: Adjust params of color buffer transitioning functions
Splitting out these fields will make the color buffer transitioning
function simpler when it gains more features.

v2: Remove unintended blank line (Iago Toral)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery
e15b1c41a4 anv/blorp: Remove 3D subresource transition workaround
For 3D image subresources undergoing a layout transition via
PipelineBarrier, we increase the number of fast-cleared layers to match
the intended behaviour of KHR_maintenance1. When such subresources
undergo layout transitions between subpasses, we don't do this to avoid
failing incorrect CTS tests. Instead, unify the behaviour in both
scenarios, and wait for the CTS tests to catch up. See CL 1111 for the
test fix and Vulkan issue #849 for more information.

On SKL+, this causes 3 test failures under:
dEQP-VK.pipeline.render_to_image.3d.*

v2: Add a reference to the Vulkan issue (Iago Toral).

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery
5ca2fbcee2 anv/cmd_buffer: Adjust the image view reloc function
Make the function take in an image instead of an image view. This
enables us to record relocations for surface states created outside of
the anv_CreateImageView path.

v2 (Jason Ekstrand):
- Use image->offset instead of surf_offset in aux_offset calculation.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery
5f4f50419c anv/cmd_buffer: Adjust layout transition aspect checking
Reflect the fact that an image view or subresource range with the color
aspect cannot have any other aspect.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery
bc838fc759 anv: Add and use color auxiliary buffer helpers
v2:
- Check for aux levels in layer helper (Jason Ekstrand)
- Don't assert aux is present, return 0 if it isn't.
- Use the helpers.
v3:
- Make the helpers aspect-agnostic (Jason Ekstrand)
- Drop anv_image_has_color_aux()

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery
8aaa13467d intel/isl: Only create a CCS buffer if the image supports rendering
v2: Omit the commit message.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery
b934330191 intel/isl: Limit CCS to one level and layer on gen7
v2 (Jason Ekstrand):
- Remove Vulkan-specific terminology from the commit title.
- Replace '== 7' with '<= 7' to hint that this is a new feature on BDW+.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery
6b23c65f3a intel/blorp: Check for layer fast-clear restriction
v2: Update commit title (Jason Ekstrand)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Nanley Chery
b46a071758 intel/blorp: Assert levels and layers are in range
v2 (Jason Ekstrand):
- Update commit title.
- Check aux level and layer as well.
v3 (Jason Ekstrand):
- Move the non-aux layer check.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-26 11:09:12 -07:00
Lucas Stach
28550c7875 etnaviv: only flush resource to self if no scanout buffer exists
Currently a resource flush may trigger a self resolve, even if a scanout buffer
exists, but is up to date. If a scanout buffer exists we only ever want to
flush the resource to the scanout buffer. This fixes a performance regression.

Fixes: dda956340c (etnaviv: resolve tile status when flushing resource)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-06-26 20:06:01 +02:00
Christian Gmeiner
d8b2ccdb88 etnaviv: add support for snorm textures
Based on a patch from Wladimir J. van der Laan and untested due
to lack of hardware. The binary blob emits those formats if the GPU
supports HALTI1 (faked with libvivhook).

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-26 19:59:42 +02:00
Christian Gmeiner
3bbf8dcfe4 etnaviv: add R8G8 texture support
Passes texwrap GL_ARB_texture_rg piglit (with faked full texture rg support).

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-26 19:56:59 +02:00
Christian Gmeiner
751ae6afbe etnaviv: add support for swizzled texture formats
Passes all ext_texture_swizzle piglits.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-26 19:56:39 +02:00
Christian Gmeiner
0ddcccac4f etnaviv: add support for extended texture formats
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-26 19:49:30 +02:00
Chad Versace
31c3c440b5 glapi: Fix -Wduplicate-decl-specifier due to double-const
Fix all lines in src/mesa/main/marshal_generated.c that declare
double-const variables. Below are all such lines, with duplicates
removed:

   $ grep 'const const' marshal_generated.c | sort -u
   const const GLboolean * pointer = cmd->pointer;
   const const GLvoid * indices = cmd->indices;
   const const GLvoid * pointer = cmd->pointer;

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-26 10:26:23 -07:00
Eric Engestrom
2b237ff64c anv: use Mesa's u_atomic.h header
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-26 18:21:22 +01:00
Eric Engestrom
a2ae2d1fb0 radv: use Mesa's u_atomic.h header
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-26 18:21:22 +01:00
Bruce Cherniak
6fafba0e67 swr: set an explicit clear_rect if scissor is not enabled.
Fix regression of "no rendering" on simple apps like glxgears by
setting an explicit full surface clear_rect when scissor is not
enabled.

This regressed with commit 00173d91 "st/mesa: don't set 16
scissors and 16 viewports if they're unused" due to an assumption
that a default scissor rect is always set, which was the case prior
to this optimization.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-06-26 11:30:08 -05:00
Tim Rowley
0e1e5a2b14 swr/rast: adjust std::string usage to fix build
Some combinations of c++ compilers and standard libraries had problems
with the string::replace code we were using previously.

This should fix the travis-ci system.

Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-26 11:29:27 -05:00
Eric Engestrom
3c7c82cef0 travis: add missing libs: xdamage + xfixes
> configure: error: Package requirements (x11 xext xdamage >= 1.1 xfixes
> x11-xcb xcb xcb-glx >= 1.8.1 xcb-dri2 >= 1.8) were not met:
> No package 'xdamage' found
> No package 'xfixes' found

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-26 15:44:37 +01:00
Nicolai Hähnle
f17d78becc radeonsi: support indirect indexing in INTERP_* opcodes
The hardware doesn't support it, so we just interpolate all array elements
and then use indirect indexing on the resulting vector.

Clearly, this is not very efficient. There is an argument to be had for
adding if/else, or perhaps even pulling the data out of LDS directly.
Neither really seems worth the effort, considering that it seems nobody
actually uses this feature.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-26 14:02:06 +02:00
Ben Crocker
162c42f8ed egl_dri2: swrastGetDrawableInfo: set *x, *y [v2]
In swrastGetDrawableInfo, set *x and *y, not just *w and *h;
this fixes a crash later in drisw_update_tex_buffer when the
(formerly) uninitialized x and y values are used to construct
an address in a call to llvmpipe_transfer_map.
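
The pattern behind the fix can be sketched as follows. The struct and function names are hypothetical and only stand in for swrastGetDrawableInfo; the real function queries the native surface rather than a plain struct.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a drawable whose geometry may be unavailable. */
struct drawable_sketch {
   bool valid;
   int w, h;
};

void get_drawable_info_sketch(const struct drawable_sketch *d,
                              int *x, int *y, int *w, int *h)
{
   /* Initialize every output first, so callers never read garbage,
    * whether the lookup below succeeds or not. */
   *x = 0;
   *y = 0;
   *w = 0;
   *h = 0;

   if (d == NULL || !d->valid)
      return;

   *w = d->w;
   *h = d->h;
}
```

A caller that later builds an address from x and y then sees zeros instead of whatever happened to be on the stack.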

Fixes crash in Piglit test
"spec@egl 1.4@eglcreatepbuffersurface and then glclear"
(<piglit dir>/bin/egl-create-pbuffer-surface -auto)
that occurred intermittently, e.g. when the uninitialized x and y in
drisw_update_tex_buffer just happened to contain absurd non-zero values.

v2: Initialize whether the function succeeds or fails, just like *w/*h.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-26 12:48:19 +01:00
Emil Velikov
c58af5cbb2 egl: fold _eglError() + return EGL_FALSE
The function _eglError() already returns EGL_FALSE, explicitly to
simplify the callers. Make use of it.

While EGL_FALSE is numerically identical to false, NULL, EGL_NO_FOO,
storage is not the same so we cannot use it for "everything".
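
The fold looks roughly like this. All names ending in _sketch are hypothetical stand-ins so that the example compiles without the EGL headers; report_error_sketch plays the role of _eglError().

```c
#include <stdio.h>

typedef unsigned int egl_boolean_sketch;
#define EGL_FALSE_SKETCH 0u
#define EGL_TRUE_SKETCH  1u

/* Records the error and always returns false, so callers can return
 * its result directly. */
egl_boolean_sketch report_error_sketch(int code, const char *msg)
{
   fprintf(stderr, "error 0x%x in %s\n", code, msg);
   return EGL_FALSE_SKETCH;
}

/* Before: two statements per error path. */
egl_boolean_sketch check_before(int value)
{
   if (value < 0) {
      report_error_sketch(0x300C, "check_before");
      return EGL_FALSE_SKETCH;
   }
   return EGL_TRUE_SKETCH;
}

/* After: the call and the return are folded into one statement. */
egl_boolean_sketch check_after(int value)
{
   if (value < 0)
      return report_error_sketch(0x300C, "check_after");
   return EGL_TRUE_SKETCH;
}
```

The fold only applies where the function's return type is the boolean type; functions returning pointers or handles still need their own return value.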

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-26 12:41:00 +01:00
Emil Velikov
d42b09580a egl: drop _eglInitImage() return type
Function cannot fail and always returns true.

v2: Inline the one line function in the header

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-26 12:40:22 +01:00
Juan A. Suarez Romero
860919a3b2 glsl: do not call link_xfb_stride_layout_qualifiers() for fragment shaders
xfb only applies to the latest stage before the fragment shader, so
there is no need to invoke it in the fragment shader.

Fixes:
KHR-GL45.enhanced_layouts.xfb_stride_of_empty_list
KHR-GL45.enhanced_layouts.xfb_stride_of_empty_list_and_api

v2: do reset only if shaders provide an explicit stride

v3: do not call link_xfb_stride_layout_qualifiers() for fragment shaders
(Timothy)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-06-26 12:00:22 +02:00
Constantine Charlamov
abc7b110b6 r600g: fix crash when file in R600_TRACE doesn't exist
Print an error in that case, which is probably not a rare event
because fopen() doesn't expand ~ to $HOME.
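
A minimal sketch of the defensive pattern, assuming a hypothetical helper name and a caller-supplied open mode; it is not the r600g code itself.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Open the trace file and report a failure instead of crashing later.
 * fopen() takes the path literally, so a leading "~" is not expanded
 * to the home directory as a shell would do. */
FILE *open_trace_file_sketch(const char *path, const char *mode)
{
   FILE *f = fopen(path, mode);

   if (f == NULL)
      fprintf(stderr, "failed to open trace file '%s': %s\n",
              path, strerror(errno));

   return f;
}
```

Callers must still check the returned pointer for NULL before writing to it.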

Also get rid of unused "bool ret" variable.

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 17:39:54 +10:00
Constantine Charlamov
3d466f3e9f r600g: take into account offset to system inputs at tgsi_interp_egcm()
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=100785

v2: I went back and forth on whether to initialize nsys_inputs at the
beginning of shader initialization or when allocating system values;
by the time I settled on the former, I had forgotten to change it back.

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 16:32:36 +10:00
Constantine Charlamov
469e2ed473 r600g: get rid of trailing whitespace
Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 16:30:10 +10:00
Dave Airlie
27380d6b3e r600/asm: add support for other GDS operations.
This adds support for the GDS operations needed to do atomic
counters.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 16:27:51 +10:00
Dave Airlie
ccab3f7e1b r600: don't merge GDS into VTX
We don't want vtx/tex instructions ending up in GDS sections.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 16:23:21 +10:00
Dave Airlie
043f16eba1 r600: for memory instructions dump index gpr for read indirects also.
This just makes sure we can see the index gpr in the asm dumps.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 16:23:21 +10:00
Dave Airlie
ac8fb9800a r600: add support for vertex fetches via texture cache
On evergreen we can route vertex fetches via the texture cache,
and this is required for some images support. So add support
to the asm builder for it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 16:23:20 +10:00
Dave Airlie
b050b91e33 r600: route indirect address register correctly for vtx fetches.
This was found during writing the images code, we need to
make sure we route the correct index register.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 16:23:20 +10:00
Dave Airlie
4a34f3244a radv/meta: don't need vertex info for resolve shader.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 01:24:10 +01:00
Marek Olšák
0715b3c2ee drirc: whitelist glthread for a few games
Performance deltas:
    Alien Isolation: +17% (it varies depending on the location)
    Borderlands 2: +50% (it varies depending on the location)
    BioShock Infinite: +76% (benchmark)
    Civilization 6: +20% (benchmark)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 02:17:03 +02:00
Marek Olšák
4f38b48e05 mesa/glthread: decrease the batch size for better perf scaling
This is the key to better performance.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 02:17:03 +02:00
Marek Olšák
09f6915bf8 gallium/hud: add glthread counters
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 02:17:03 +02:00
Marek Olšák
8f4bc8a324 gallium/hud: add API-thread-busy for monitoring the thread load
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 02:17:03 +02:00
Marek Olšák
11cf079b67 gallium/hud: add hud_pane::hud pointer
for later use

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 02:17:03 +02:00
Marek Olšák
5fa69be3c8 mesa/glthread: add glthread "perf" counters and pass them to gallium HUD
for HUD integration in following commits. This valuable profiling data
will allow us to see on the HUD how well glthread is able to utilize
parallelism. This is better than benchmarking, because you can see
exactly what's happening and you don't have to be CPU-bound.

u_threaded_context has the same counters.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 02:17:03 +02:00
Marek Olšák
833f3c1c31 gallium/hud: move struct hud_context to hud_private.h
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 02:17:03 +02:00
Marek Olšák
7492201c4e gallium/hud: rename API-thread-busy to main-thread-busy
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 02:17:03 +02:00
Marek Olšák
d1513edaa0 mesa/glthread: switch to u_queue and redesign the batch management
This mirrors exactly how u_threaded_context works.
If you understand this, you also understand u_threaded_context.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 02:17:03 +02:00
Marek Olšák
1e37a5054b mesa/glthread: remove HAVE_PTHREAD guards
we are switching to util_queue.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 02:17:03 +02:00
Marek Olšák
6884c95ab4 util: move pipe_thread_is_self from gallium to src/util
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 02:17:03 +02:00
Bas Nieuwenhuizen
78bef01da2 radv: Remove unused args of radv_image_view_init.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-26 01:24:50 +02:00
Bas Nieuwenhuizen
789f480029 radv: Use correct image layout for blit based copies.
v2: Don't pass layout to image view usage mask.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: 0628580eff "radv: Specify semantics of HTILE layout helpers."
2017-06-26 01:24:29 +02:00
Grigori Goronzy
95fb1c187a mesa/marshal: add custom marshalling for glNamedBuffer(Sub)Data
These entry points are used by Alien Isolation and caused
synchronization with glthread. The async marshalling implementation
is similar to glBuffer(Sub)Data. However unlike Buffer(Sub)Data
we don't need to worry about EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD,
as this isn't applicable to these DSA variants.

Results in an approximately 6x drop in glthread synchronizations and a
~30% FPS jump in Alien Isolation (Medium preset, Athlon 860K, RX 480).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-26 09:06:23 +10:00
Dave Airlie
6a68170c83 radv: handle primitive id input into fragment shader with no geom shader
Fixes:
dEQP-VK.pipeline.framebuffer_attachment.no_attachments
dEQP-VK.pipeline.framebuffer_attachment.no_attachments_ms

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 08:45:30 +10:00
Dave Airlie
2a87ddbdcb radv: compile fragment shader first.
This reorders things as we need something from the fs for the vs key.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 08:45:26 +10:00
Dave Airlie
a563f611c3 radv: set prim_id for geometry shaders
Noticed in passing.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 08:45:22 +10:00
Dave Airlie
4042892cee radv: set use_prim_id for tess shaders correctly.
Just noticed in passing.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 08:45:14 +10:00
Pierre Moreau
afb8f2d4a3 nv50/ir: Properly fold constants in SPLIT operation
Fixes: b7d9677d ("nv50/ir: constant fold OP_SPLIT")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-06-25 15:23:46 +02:00
Marek Olšák
e25950808f radeonsi/gfx9: don't overallocate shader binaries
It's not needed. The hw doesn't fetch ahead over page boundaries.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-24 23:04:37 +02:00
Lucas Stach
d6b9ba36a4 st/dri2: implement image offset query
This trivially adds support for the image offset query, which is needed
for the zwp_linux_dmabuf based EGL platform wayland implementation.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-06-24 16:57:55 +01:00
Samuel Pitoiset
cb577e379e mesa: only flush vertices when the viewport is different
This prevents glViewport() and friends from always flushing and
triggering _NEW_VIEWPORT.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-24 16:47:43 +02:00
Samuel Pitoiset
4178cea06d mesa: remove useless comments in the viewport code path
No need to explain why calling a driver callback is needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-24 16:47:38 +02:00
Roland Scheidegger
8bfe451ed3 llvmpipe: initialize default fb correctly in setup
If lp_setup_bind_framebuffer() is never called, then setup fb x1/y1 was not
correctly initialized. This can happen if there's never a fb set - both
cso and llvmpipe would consider setting this with no cbufs and no zsbuf a
redundant change and therefore it would never get set.
We rely on this setup fb rect being initialized correctly for the tri intersect
tests, throwing away tris which don't intersect. Not initializing it meant
we'd then say it intersected, and we'd try to bin that despite that we have
no actual tiles to bin it to, leading to assertion failures (pretty harmless
since tile 0/0 always exists nevertheless as tiles are statically allocated,
albeit that should change at some point).
(Note probably not an issue with gl state tracker)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-06-24 00:18:43 +02:00
Jason Ekstrand
f7f2fa8eb1 i965/miptree: Rework aux enabling
This commit replaces the complex and confusing set of disable flags with
two fairly straightforward fields which describe the intended auxiliary
surface usage and whether or not the miptree supports fast clears.
Right now, supports_fast_clear can be entirely derived from aux_usage
but that will not always be the case.

This commit makes functional changes.  One of these changes is that it
re-enables multisampled fast-clears which were accidentally disabled in
cec30a6669 around a year ago.  Fixing this
improves the SynMark v7 DeferredAA test by around 3% on some gen9
hardware.  This commit also gets us closer to enabling CCS_E for
window-system buffers which are Y-tiled.

Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-23 12:30:24 -07:00
Jason Ekstrand
f1fa4be871 i965: Clamp clear colors to the representable range
Starting with Sky Lake, we can clear to arbitrary floats or integers.
Unfortunately, the hardware isn't particularly smart when it comes to
sampling from that clear color.  If the clear color is out of range for
the surface format, it will happily return whatever we put in the
surface state packet unmodified.  In order to avoid returning bogus
values for surfaces with a limited range, we need to do some clamping.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-23 12:30:24 -07:00
Jason Ekstrand
793b312b4a i965: Don't bother with HiZ in renderbuffer_move_to_temp
This function is only used on gen4-5 which don't support HiZ.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-23 12:30:24 -07:00
Jason Ekstrand
764cce442e i965/miptree: Rename the non_msrt_mcs functions to _ccs
While we're here, we also make the two support checks static since there
are no users outside intel_mipmap_tree.c.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-23 12:30:24 -07:00
Jason Ekstrand
a7059a764e i965/miptree: Delete the layered rendering resolve
We never fast-clear more than the base slice (LOD 0, layer 0) anyway, so
layered rendering without a resolve is always perfectly safe.  Should
this ever change in the future, we'll have to put some sort of resolve
back in but we can cross that bridge when we come to it.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-23 12:30:24 -07:00
Anuj Phogat
7896dee349 anv/cnl: Don't write to Cache Mode Register 1 on gen10+
For the PartialResolveDisableInVC field, the recommendation is to
always set it to 0, which is the default value of the bit.
So, we have nothing left to write to CACHE_MODE_1.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-23 11:16:00 -07:00
Anuj Phogat
b980553309 i965/cnl: Don't write to Cache Mode Register 1 on gen10+
With the optimizations below gone in gen10+, we have nothing left to
write to CACHE_MODE_1:
Float Blend Optimization Enable: This bit has been removed in gen10+.
Partial Resolve Disable in VC: The recommendation is to always set this
field to 0 in gen10+, which is the default value of the bit.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-23 11:16:00 -07:00
Marek Olšák
f6e98e99e3 radeonsi: unreference vertex buffers when destroying the context
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-23 19:53:54 +02:00
Edmondo Tommasina
2ea16f08f3 drirc: Add glsl_correct_derivatives_after_discard for The Witcher 2
This fixes the long-standing problem with black transitions in The Witcher 2.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98238

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-06-23 19:50:20 +02:00
Marek Olšák
ee16796d54 radeonsi: implement the workaround for Rocket League - postponed TGSI kill
Do KILL at the end of shaders so as not to break WQM.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100070

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-23 19:50:20 +02:00
Marek Olšák
a98a04ec80 gallium/radeon: pass create_screen flags to r600_common_screen_init
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-23 19:50:20 +02:00
Marek Olšák
118b2008ba st/dri: add a drirc workaround for Rocket League
This needs to be passed to gallium drivers.

No game fix is planned at this time.

The addition of glsl_correct_derivatives_after_discard is
generally a good thing for mesa compatibility with the broader GL
driver ecosystem.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100070

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-23 19:50:20 +02:00
Marek Olšák
6b0f6e693b st/dri: get drirc options before creating pipe_screen
dri_init_options_get_screen_flags will return the flags for create_screen().

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-23 19:50:20 +02:00
Marek Olšák
76f379330a gallium: allow passing 'unsigned flags' to create_screen()
for drirc options

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-23 19:50:20 +02:00
Marek Olšák
516488bb51 mesa: don't flush vertices in glClientActiveTexture
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-06-23 19:50:20 +02:00
Marek Olšák
522173aee4 mesa: don't flag _NEW_ARRAY for GL_PRIMITIVE_RESTART_NV
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-06-23 19:50:20 +02:00
Roland Scheidegger
c7688d2de5 llvmpipe:fix using 32bit rasterization mistakenly, causing overflows
We use the bounding box (triangle extents) to figure out if 32bit rasterization
could potentially overflow. However, we used the bounding box which already got
rounded up to 0 for negative coords for this, which is incorrect, leading to
overflows and hence bogus rendering in some of our private use.

It might be possible to simplify this somehow (we're now using 3 different
boxes for binning) but I don't quite see how.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-06-23 19:39:29 +02:00
Roland Scheidegger
672d245ffe llvmpipe: fill in debug vertex info for tri rasterization
This is pretty useful for debugging rasterization issues, so turn it on
based on DEBUG (the actual existence of the fields is also conditionalized
on DEBUG, lines fill it out the same too).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-06-23 19:39:29 +02:00
Marek Olšák
c2f82fc1d3 Revert "radeonsi: don't emit partial flushes at the end of IBs (v2)"
This reverts commit c9040dc9e7.

People have reported it causes corruption on VI, and I see GPU hangs
on GFX9.
2017-06-23 19:13:55 +02:00
Samuel Pitoiset
7f7487f262 mesa: remove spurious flush in _mesa_Viewport()
I don't think this is actually required, if the viewport
values are different from the ones stored in the context, we
already flush and trigger _NEW_VIEWPORT in
set_viewport_no_notify().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-06-23 16:26:25 +02:00
Samuel Pitoiset
2f76b45415 mesa: remove spurious flush in _mesa_DepthRange()
I don't think this is actually required, if the depth range
values are different from the ones stored in the context, we
already flush and trigger _NEW_VIEWPORT in
set_depth_range_no_notify().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-06-23 16:26:25 +02:00
Samuel Pitoiset
f314a532fd mesa: do not trigger _NEW_TEXTURE_STATE in glActiveTexture()
This looks useless because gl_context::Texture::CurrentUnit
is not used by _mesa_update_texture_state() and friends.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-06-23 16:26:24 +02:00
Samuel Pitoiset
c244c25ce3 mesa: add KHR_no_error support for glViewport()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-23 09:26:43 +02:00
Samuel Pitoiset
ad0afa87b8 mesa: add viewport() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-23 09:26:43 +02:00
Samuel Pitoiset
128822c59f mesa: add KHR_no_error support for glViewportArrayv()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-23 09:26:43 +02:00
Samuel Pitoiset
e1d6de7a1e mesa: add viewport_array() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-23 09:26:43 +02:00
Samuel Pitoiset
0a667f03bb mesa: add KHR_no_error support for glViewportIndexed*()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-23 09:26:43 +02:00
Samuel Pitoiset
efd42b5791 mesa: rename ViewportIndexedf() to viewport_indexed_err()
While we are at it, add a 'context' parameter for consistency.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-23 09:26:43 +02:00
Samuel Pitoiset
52a448c7d0 mesa: add KHR_no_error support for glClipControl()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-23 09:26:42 +02:00
Samuel Pitoiset
5a6779c722 mesa: add clip_control() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-23 09:26:42 +02:00
Rafael Antognolli
9fd0aee17d i965: Convert upload_default_color to genxml.
This function was moved to genX_state_upload.c but was still not using genxml.
By converting it to genxml, we make some things simpler, like setting
haswell's border color state, but others are more complex, since the structs
used by each gen are different.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-22 16:51:51 -07:00
Rafael Antognolli
e547915935 i965: Remove unused code and delete file.
The sampler state code was all moved to genxml, so we can get rid of these
functions and delete the file.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-22 16:51:51 -07:00
Rafael Antognolli
e30bbe32a3 i965: Convert vs, gs, tcs, tes and cs samplers to genxml.
Since they just use the code that is already available in genX_state_upload.c,
convert them in one batch.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-22 16:51:51 -07:00
Rafael Antognolli
f8d69beed4 i965: Convert fs sampler state to use genxml.
Also convert some auxiliary functions used by it, and copy
upload_default_color to genX_state_upload.c.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-22 16:51:47 -07:00
Rafael Antognolli
9b78a52042 genxml: fix gen5 sampler border color state.
Based on the current code, gen5 and gen6 have the same sampler border color
state struct. So fix the gen5 one to match gen6.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-22 16:38:44 -07:00
Rafael Antognolli
f43c21cbbd aubinator: Dump sampler state pointers on gen6 too.
We already have a function to dump sampler states, so do that for gen6
too.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-22 16:38:44 -07:00
Chad Versace
ecd8f85802 anv: Fix -Wswitch in anv_layout_to_aux_usage()
anv_layout_to_aux_usage() lacked a case for
VK_IMAGE_LAYOUT_SHARED_PRESENT_KHR. Add an unreachable case, because we
don't support the extension.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 15:18:24 -07:00
Chad Versace
55f335bd30 i965: Fix -Wunused-variable in gen8_write_pma_stall_bits()
Trivial fix.  'ctx' was unused.
2017-06-22 14:44:06 -07:00
Anusha Srivatsa
de7ed0ba55 i965/CFL: Add PCI Ids for Coffee Lake.
Coffee Lake has gen9 graphics, following KBL.
From a 3D perspective, CFL is a clone of KBL/SKL features.

v2: Change commit message, correct alignment <Anuj Phogat>
v3: Update IDs.
v4: Initialize l3_banks, correct nomenclature <Anuj>

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Acked-by: Benjamin Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-06-22 14:28:43 -07:00
Anuj Phogat
43d11b128c intel: Enable vulkan build for gen10
This patch just enables building Vulkan libs for gen10. We
still don't have gen 10 support enabled on Vulkan.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:46 -07:00
Anuj Phogat
ac6bc0e034 anv/cnl: Generate and use gen10 functions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:45 -07:00
Anuj Phogat
c17e214a6b anv/cnl: Don't set FloatBlendOptimizationEnable{Mask}
This field is removed from the CACHE_MODE_1 register in gen10.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:45 -07:00
Anuj Phogat
bf1d2c37c6 anv/cnl: Use GENX(xx) in place of GEN9_xx
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:45 -07:00
Anuj Phogat
1e5a5d18d1 anv/cnl: Add #defines for MOCS and genX(x)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 14:17:45 -07:00
Anuj Phogat
ceed55e7bb intel/genxml: Add Gen10 CACHE_MODE_1 definitions
A few of the fields in this register have changed compared
to gen9.xml.

V2: Remove some fields which are not valid anymore.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat
6338b63270 intel/genxml: Rename StartInstanceLocation to StartingInstanceLocation
This is required because we already have a macro defined with
the name StartInstanceLocation.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat
8869c8b3dc intel/genxml: Rename IndirectStatePointer to BorderColorPointer
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat
97f75fdfd0 intel/genxml: Combine DataDWord{0, 1} fields in to ImmediateData field
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat
c61b909d14 intel/genxml: Add INSTDONE registers in gen10
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Anuj Phogat
03fddd3c1d intel/genxml: Add better support for MI_MATH in gen10
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-22 14:17:45 -07:00
Chad Versace
a9e5e9f5ec i965/dri: Add intel_screen param to intel_create_winsys_renderbuffer
The param is currently unused. It will later be used to support
R8G8B8X8 EGLConfigs on Skylake.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-22 12:44:28 -07:00
Chad Versace
4b9cbfa0b0 i965: Move brw_context format arrays to intel_screen
This allows us to query the driver's supported formats in i965's DRI code,
where often there is available a DRIscreen but no GL context.

To reduce diff noise, this patch does not completely remove
brw_context's format arrays. It just redeclares them as pointers which
point to the arrays in intel_screen.

Specifically, move these two arrays from brw_context to intel_screen:
    mesa_to_isl_render_format[]
    mesa_format_supports_render[]

And add a new array to intel_screen,
    mesa_format_supports_texture[]
which brw_init_surface_formats() copies to ctx->TextureFormatSupported.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-22 12:44:28 -07:00
Chad Versace
c09b2aefae i965: Rename some vague format members of brw_context
I'm swimming in a vortex of formats. Mesa formats, isl formats, DRI
formats, GL formats, etc.

It's easy to misinterpret the following brw_context members unless
you've recently read their definition.  In upcoming patches, I change
them from embedded arrays to simple pointers; after that, even their
definition doesn't help, because the MESA_FORMAT_COUNT hint will no
longer be present.

Rename them to prevent further confusion. While we're renaming, choose
shorter names too.

    -format_supported_as_render_target
    +mesa_format_supports_render

    -render_target_format
    +mesa_to_isl_render_format

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-22 12:43:53 -07:00
Chad Versace
ffbf50b1c6 egl: Rename 'count' in ${platform}_add_configs_for_visuals (v2)
Rename 'count' to 'config_count'. I didn't understand what the variable
did until I untangled the for-loops. Now the next person won't have that
problem.

v2: Rebase. Fix typo. Apply to all platforms (for emil).

Reviewed-by: Eric Engestrom <eric@engestrom.ch>  (v1)
2017-06-22 12:35:49 -07:00
Chad Versace
a6fad55961 egl/x11: Declare EGLConfig attrib array inside loop
No behavioral change. Just a readability cleanup.

Instead of modifying this small array on each loop iteration, we now
initialize it in-place with the values it needs.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-06-22 12:35:49 -07:00
Chad Versace
f8ad7f4054 egl/drm: Declare EGLConfig attrib array inside loop
No behavioral change. Just a readability cleanup.

Instead of modifying this small array on each loop iteration, we now
initialize it in-place with the values it needs.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-06-22 12:35:49 -07:00
Chad Versace
bd789098a5 egl/android: Declare EGLConfig attrib array inside loop (v2)
No behavioral change. Just a readability cleanup.

Instead of modifying this small array on each loop iteration, we now
initialize it in-place with the values it needs.

v2: Rebase.

Reviewed-by: Eric Engestrom <eric@engestrom.ch> (v1)
2017-06-22 12:35:49 -07:00
Chad Versace
cd717cbe1a egl/dri2: Declare loop vars inside the loop
That is, consistently do this:

    for (int i = 0; ...)

No behavioral change.
This patch touches only egl_dri2.c.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-06-22 12:35:49 -07:00
Chad Versace
98497dfd6a egl/wayland: Declare loop vars inside the loop
That is, consistently do this:

    for (int i = 0; ...)

No behavioral change.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-06-22 12:35:49 -07:00
Chad Versace
927625ca60 egl/surfaceless: Move loop vars inside the loop
That is, consistently do this:

    for (int i = 0; ...)

No behavioral change.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-06-22 12:35:49 -07:00
Chad Versace
263d4b8b1c egl/x11: Declare loop vars inside the loop
That is, consistently do this:

    for (int i = 0; ...)

No behavioral change.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-06-22 12:35:49 -07:00
Chad Versace
c31146f080 egl/drm: Move loop vars inside the loop
That is, consistently do this:

    for (int i = 0; ...)

No behavioral change.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-06-22 12:35:49 -07:00
Chad Versace
09455123f3 egl/android: Declare loop vars inside their loops (v2)
That is, consistently do this:

    for (int i = 0; ...)

No behavioral change.

v2: Rebase.

Reviewed-by: Eric Engestrom <eric@engestrom.ch> (v1)
2017-06-22 12:35:49 -07:00
Brian Paul
9e57a2cbcf svga: minor whitespace fixes in svga_pipe_vertex.c
2017-06-22 13:33:48 -06:00
Brian Paul
041f8ae9f6 svga: check return value from svga_set_shader( SVGA3D_SHADERTYPE_GS, NULL)
If the call fails we need to flush the command buffer and retry.  In this
case, we were failing to unbind the GS which led to subsequent errors.

This fixes a bug replaying a Cinebench R15 apitrace in a Linux guest.
VMware bug 1894451

cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-06-22 13:33:48 -06:00
Charmaine Lee
3fbdab8778 svga: fix pre-mature flushing of the command buffer
When surface_invalidate is called to invalidate a newly created surface
in svga_validate_surface_view(), it is possible that the command
buffer is already full, and in this case, currently, the associated wddm
winsys function will flush the command buffer and resend the invalidate
surface command. However, this can prematurely flush the command buffer
if there are still pending image updates to be patched.

To fix the problem, this patch will add a return status to the
surface_invalidate interface and if it returns FALSE, the caller will
call svga_context_flush() to do the proper context flush.
Note, we don't call svga_context_flush() if surface_invalidate()
fails when flushing the screen surface cache though, because it is
already in the process of a context flush and all the image updates are
already patched, so calling svga_context_flush() can trigger a deadlock.
So in this case, we call the winsys context flush interface directly
to flush the command buffer.

Fixes driver errors and graphics corruption running Tropics. VMware bug 1891975.

Also tested with MTT glretrace, piglit and various OpenGL apps such as
Heaven, CinebenchR15, NobelClinicianViewer, Lightsmark, GoogleEarth.

cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-06-22 13:33:48 -06:00
George Kyriazis
08cb8cf256 swr: invalidate attachment on transition change
Consider the following RT attachment order:
1. Attach surfaces to attachments 0 & 1, and render with them
2. Detach 0 & 1
3. Re-attach 0 & 1 to different surfaces
4. Render with the new attachment

The definition of a tile being resolved is that local changes have been
flushed out to the surface, hence there is no need to reload the tile before
it's written to.  For an invalid tile, the tile has to be reloaded from
the surface before rendering.

Stage (2) was marking hot tiles for attachments 0 & 1 as RESOLVED,
which means that the hot tiles can be written out to memory with no
need to read them back in (they are "clean").  They need to be marked as
resolved here, because a surface may be destroyed after a detach, and we
don't want to have un-resolved tiles that may force a readback from a
NULL (destroyed) surface.  (Part of a destroy is to detach all attachments first.)

In stage (3), during the no att -> att transition, we need to realize that the
"new" surface tiles need to be fetched fresh from the new surface, instead
of using the resolved tiles, that belong to a stale attachment.

This is done by marking the hot tiles as invalid in stage (3), when we realize
that a new attachment is being made, so that they are re-fetched during
rendering in stage (4).

Also note that hot tiles are indexed by attachment.

- Fixes VTK dual depth-peeling tests.
- No piglit changes

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-06-22 11:51:08 -05:00
Juan A. Suarez Romero
87a2d3963a Revert "getteximage: Return correct error value when texure object is not found"
From OpenGL 4.5 spec PDF, section '8.11. Texture Queries', page 236:
  "An INVALID_VALUE error is generated if texture is not the name of
   an existing texture object."

Same wording applies to the compressed version.

But it turns out this is a spec bug, and Khronos is fixing it for the next
revisions.

The proposal is to return INVALID_OPERATION in these cases.

This reverts commit 633c959fae.

v2:
- Use _mesa_lookup_texture_err (Samuel Pitoiset)

v3:
- _mesa_lookup_texture_err() already handles texture > 0 (Samuel
Pitoiset)
- Just revert 633c959fae (Juan A. Suarez)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-06-22 18:48:18 +02:00
Eric Engestrom
c87f73724e egl: properly count configs
dri2_conf represents another config (which shouldn't be counted)
if it doesn't have the requested ID.

Reported-by: Liu Zhiquan <zhiquan.liu@intel.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-22 17:32:31 +01:00
Chad Versace
5e884353e6 egl/android: Change order of EGLConfig generation (v2)
Many Android apps (such as Google's official NDK GLES2 example app), and
even portions of the core framework code (such as SystemServiceManager in
Nougat), incorrectly choose their EGLConfig.  They neglect to match the
EGLConfig's EGL_NATIVE_VISUAL_ID against the window's native format, and
instead choose the first EGLConfig whose channel sizes match those of
the native window format while ignoring the channel *ordering*.

We can detect such buggy clients in logcat when they call
eglCreateSurface, by detecting the mismatch between the EGLConfig's
format and the window's format.

As a workaround, this patch changes the order of EGLConfig generation
such that all EGLConfigs for HAL pixel format i precede those for HAL
pixel format i+1. In my (chadversary) testing on Android Nougat, this
was good enough to pacify the buggy clients.

v2: Rebase to make patch cherry-pickable to stable.

Cc: mesa-stable@lists.freedesktop.org
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-22 08:58:45 -07:00
Ville Syrjälä
1c409fe4c1 i915: Fix gl_Fragcoord interpolation
gl_FragCoord contains the window coordinates so it seems to me that
we should not use perspective correct interpolation for it. At least
now I get similar output as i965/swrast/llvmpipe produce.

This fixes dEQP-GLES2.functional.shaders.builtin_variable.fragcoord_w.
dEQP-GLES2.functional.shaders.builtin_variable.fragcoord_xyz was already
passing, though I'm not quite sure how it managed to do that.

v2: Add definitions for the S3 "wrap shortest" bits as well (Ian)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-06-22 17:26:17 +03:00
Eric Engestrom
b81cfc7340 egl: simplify dri_config conditionals
In the same spirit as 858f2f2ae6 (egl/dri2: ease srgb __DRIconfig
conditionals), let's merge dri_single_config and dri_double_config into
a single dri_config[2].

This moves the `if (double) dri_double_config else dri_single_config`
logic to `dri_config[double]`, reducing code duplication and making it
easier to read.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-22 14:54:36 +01:00
Marek Olšák
bcd67b1711 radeonsi/gfx9: enable DCC fast clear
It seems to work now.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 13:15:27 +02:00
Marek Olšák
db37c0be13 radeonsi/gfx9: don't ever flush the TC metadata cache
The closed Vulkan driver doesn't do it either.

Also remove some old comments that aren't useful.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 13:15:27 +02:00
Marek Olšák
920f20f039 radeonsi/gfx9: use TC L2 for fast color clear with CP DMA
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 13:15:27 +02:00
Marek Olšák
c1754b69dc radeonsi: fix DCC fast clear for luminance and alpha formats
I reproduced this bug on Polaris11 and Raven.

I can't get this bug on Fiji. The reason might be that Fiji doesn't use
2D tiling for the test due to higher 2D tiling alignment requirements.

Fixes piglit: spec@ext_framebuffer_object@fbo-fast-clear

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 13:15:27 +02:00
Marek Olšák
c9040dc9e7 radeonsi: don't emit partial flushes at the end of IBs (v2)
The kernel sort of does the same thing with fences.

v2: do emit partial flushes on SI

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 13:15:27 +02:00
Andres Gomez
5352174d49 anv: FORMAT_FEATURE_TRANSFER_SRC/DST_BIT_KHR not used with VkFormatProperties.bufferFeatures
VK_FORMAT_FEATURE_TRANSFER_[SRC|DST]_BIT_KHR is a flag value of the
VkFormatFeatureFlagBits enum that can only be held and checked against
the linearTilingFeatures or optimalTilingFeatures members of the
VkFormatProperties struct but not the bufferFeatures member.

From the Vulkan® 1.0.51, with the VK_KHR_maintenance1 extension,
section 32.3.2 docs for VkFormatProperties:

   "* linearTilingFeatures is a bitmask of VkFormatFeatureFlagBits
      specifying features supported by images created with a tiling
      parameter of VK_IMAGE_TILING_LINEAR.

    * optimalTilingFeatures is a bitmask of VkFormatFeatureFlagBits
      specifying features supported by images created with a tiling
      parameter of VK_IMAGE_TILING_OPTIMAL.

    * bufferFeatures is a bitmask of VkFormatFeatureFlagBits
      specifying features supported by buffers."

    ...

    Bits which can be set in the VkFormatProperties features
    linearTilingFeatures, optimalTilingFeatures, and bufferFeatures
    are:

    typedef enum VkFormatFeatureFlagBits {

    ...

      VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR = 0x00004000,
      VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR = 0x00008000,

    ...

    } VkFormatFeatureFlagBits;

    ...

    The following bits may be set in linearTilingFeatures and
    optimalTilingFeatures, specifying that the features are supported
    by images or image views created with the queried
    vkGetPhysicalDeviceFormatProperties::format:

    ...

    * VK_FORMAT_FEATURE_TRANSFER_SRC_BIT_KHR specifies that an image
      can be used as a source image for copy commands.

    * VK_FORMAT_FEATURE_TRANSFER_DST_BIT_KHR specifies that an image
      can be used as a destination image for copy commands and clear
      commands."

Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Iago Toral Quiroga <itoral@igalia.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-22 13:45:22 +03:00
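A minimal sketch of the rule quoted in 5352174d49, assuming a driver computes one feature mask and then derives bufferFeatures from it. The two bit values come from the spec excerpt above; the macro and function names are hypothetical, and this is not how anv itself is necessarily structured.

```c
#include <stdint.h>

/* Values quoted from the VkFormatFeatureFlagBits excerpt above. */
#define EXAMPLE_TRANSFER_SRC_BIT_KHR 0x00004000u
#define EXAMPLE_TRANSFER_DST_BIT_KHR 0x00008000u

/* Hypothetical helper: strip the image-only transfer bits from a
 * feature mask before it is reported as bufferFeatures. */
static uint32_t
example_buffer_features(uint32_t features)
{
    return features & ~(EXAMPLE_TRANSFER_SRC_BIT_KHR |
                        EXAMPLE_TRANSFER_DST_BIT_KHR);
}
```

All other bits pass through unchanged, so only the two transfer flags are affected.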
Chandu Babu N
1d4cbcdf28 change va max_entrypoints
As encode support is added along with decode, increase max_entrypoints to two.
Before this commit, vaMaxNumEntrypoints returned an incorrect value, which
caused memory corruption.

v2: assert when max_entrypoints needs to be bigger

CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-06-22 12:10:57 +02:00
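The v2 note in 1d4cbcdf28 ("assert when max_entrypoints needs to be bigger") can be illustrated with a small sketch. The constant, the entrypoint values, and the function are hypothetical stand-ins, not the st/va code; the point is only that the fill routine checks its count against the bound that callers use to size their arrays.

```c
#include <assert.h>

#define EXAMPLE_MAX_ENTRYPOINTS 2  /* one for decode, one for encode */

/* Hypothetical query: writes supported entrypoints into `list`, which the
 * caller is assumed to have sized with EXAMPLE_MAX_ENTRYPOINTS, and
 * returns how many entries were written. */
static int
example_query_entrypoints(int has_decode, int has_encode, int *list)
{
    int num = 0;

    if (has_decode)
        list[num++] = 1;   /* placeholder id for "decode" */
    if (has_encode)
        list[num++] = 2;   /* placeholder id for "encode" */

    /* In debug builds, flag the case where a new entrypoint was added
     * without raising the advertised maximum. */
    assert(num <= EXAMPLE_MAX_ENTRYPOINTS);
    return num;
}
```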
Chandu Babu N
b1a359b7d8 st/va: Fix leak in VAAPI subpictures
The sampler view allocated in vaAssociateSubpicture is not cleared
in vaDeassociateSubpicture.

Reviewed-by: Christian König <christian.koenig@amd.com>
2017-06-22 12:09:43 +02:00
Timothy Arceri
9e9f7840bd glsl: tidy up int declaration
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-22 20:06:38 +10:00
Timothy Arceri
95927bb27f glsl: fix typo in comment
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-22 20:06:32 +10:00
Samuel Pitoiset
a285caaf25 mesa: fix using texture id 0 with glTextureSubImage*()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 10:41:36 +02:00
Samuel Pitoiset
45eb87e5e5 mesa: fix using texture id 0 with gl*TextureParameter*()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 10:41:30 +02:00
Samuel Pitoiset
7f47c31f8c mesa: fix using texture id 0 with VDPAURegisterSurfaceNV()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 10:41:22 +02:00
Samuel Pitoiset
51a7e0d14f mesa: fix using texture id 0 with glTextureStorage*()
This fixes an assertion in debug build, and probably a crash
in release build.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 10:41:19 +02:00
Samuel Pitoiset
1f38363e68 mesa: pass the 'caller' function to texturestorage() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 10:41:17 +02:00
Samuel Pitoiset
8a7ab8d418 mesa: use _mesa_lookup_texture_err() in get_tex_obj_for_clear()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 10:41:15 +02:00
Samuel Pitoiset
048de9e34a mesa: remove unused _mesa_delete_nameless_texture()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 10:41:13 +02:00
Samuel Pitoiset
75044f0854 mesa: check for allocation failures in _mesa_new_texture_object()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 10:41:10 +02:00
Nicolai Hähnle
da2e52b382 radeonsi: use the correct LLVMTargetMachineRef in si_build_shader_variant
si_build_shader_variant can actually be called directly from one of the
normal-priority compiler threads. In that case, the thread_index is
only valid for the normal tm array.

v2:
- use the correct sel/shader->compiler_ctx_state

Fixes: 86cc809726 ("radeonsi: use a compiler queue with a low priority for optimized shaders")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-22 09:45:23 +02:00
Marek Olšák
79bd1d4f8b radeonsi/gfx9: keep reusing the same buffer/address for the gfx9 flush fence
instead of using a monotonic suballocator

v2: initialize the memory at context creation

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
c66fc618cc radeonsi/gfx9: enable the constant engine
I think this kernel commit fixes it:
     drm/amdgpu:use FRAME_CNTL for new GFX ucode

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
d7141d8bc0 radeonsi/gfx9: indirect buffers and all CP packets use TC L2
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
2638250fec radeonsi: flush CB after MSAA only when transitioning from CB to textures
The main flush before texturing is done after the FMASK decompress pass.

CB after MSAA rendering is not flushed in set_framebuffer_state and also
not in memory_barrier if the current color buffer is MSAA. We fully rely
on the FMASK decompress pass for the flushing.

Some CB decompress and resolve passes need an explicit flush before and
after.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
51c219739c radeonsi: unify CB_RESOLVE blitter invocation code
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
2263610827 radeonsi: flush DB caches only when transitioning from DB to texturing
Use the mechanism of si_decompress_textures, but instead of doing
the actual decompression, just flag the DB cache flush there.

This removes a lot of unnecessary DB cache flushes.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
fdca690e91 radeonsi: add separate HUD counters for CB and DB cache flushes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
b007744051 st/mesa: don't set the border color if it's unused
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
743ad599a9 st/mesa: don't set 16 scissors and 16 viewports if they're unused
Only do so if there is a shader writing gl_ViewportIndex.
This removes a lot of CPU overhead for the most common case.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
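The idea in 743ad599a9 can be sketched in a few lines. The names are hypothetical and the real st/mesa code is not reproduced here; the sketch only shows the decision the commit message describes, assuming the state tracker knows whether the bound shader writes gl_ViewportIndex.

```c
#include <stdbool.h>

#define EXAMPLE_MAX_VIEWPORTS 16

/* Hypothetical helper: how many viewports (and scissors) to upload.
 * Without a shader writing gl_ViewportIndex only index 0 can be used. */
static unsigned
example_num_viewports(bool shader_writes_viewport_index)
{
    return shader_writes_viewport_index ? EXAMPLE_MAX_VIEWPORTS : 1;
}
```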
Marek Olšák
2ec1e32d11 st/mesa: fix pipe_rasterizer_state::scissor with multiple viewports
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
d7e52327f0 st/mesa: simplify st_update_viewport
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
a8753b254d st/mesa: remove redundant sample_mask checking
cso does that too

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
2108b73cf3 st/mesa: use precomputed st_fb_orientation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
91579254db mesa: don't call _mesa_update_clip_plane in the GL core profile
It uses the projection matrix to transform the clip plane.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-06-22 01:51:02 +02:00
Marek Olšák
602a3e50e5 st/mesa: set st_context::...num_samplers to 0 when there are no samplers
This was missed during my st/mesa series.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
f368ea37a2 st/mesa: unify fail paths for update_single_texture
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
d14bb37a0a st/mesa: don't call u_sampler_view_default_template for sampler views
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
bfe1e7737a st/mesa: always set sampler swizzle according to the texture base format
Mainly don't (indirectly) call util_format_description here.

If the driver supports texture swizzling, this will always do the right
thing. If the driver doesn't support it, it doesn't matter.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
25723857d9 st/mesa: samplers only need to track whether GLSL >= 130
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
3ee1c9b126 st/mesa: simplify get_texture_format_swizzle
- Don't check GL_NONE (that was only for buffers).
- Don't use util_format_is_depth_or_stencil.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
f0ecd36ef8 st/mesa: add an entirely separate codepath for setting up buffer views
Remove handling of buffers from all texture paths.
This simplifies things for both buffers and textures.

get_sampler_view_format is also cleaned up not to call
util_format_is_depth_and_stencil.

v2: also update st_NewTextureHandle

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
2017-06-22 01:51:02 +02:00
Marek Olšák
fbd9cc6169 st/mesa: don't return an error from update_single_texture
It can just return a NULL sampler view, which is better than not doing
anything at all.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
6f4ead6bfd st/mesa: clean up trivial dereferences in update_textures
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
5766c18f59 st/mesa: don't check MaxTextureImageUnits in update_textures
The linker takes care of it.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
4c0bce921b st/mesa: don't call st_shader_stage_to_ptarget in update_textures
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
c9a16fde80 cso: inline a few frequently-used functions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
0e0fc1ce71 cso: don't return errors from sampler functions
No code checks the errors.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
4d6fab245e cso: don't track the number of sampler states bound
This removes 2 loops from hot codepaths and adds 1 loop to a rare codepath
(restore_sampler_states), and makes sanitize_hash() slightly worse.

Sampler states, when bound, are not unbound for draw calls that don't need
them. That's OK, because bound sampler states don't add any overhead.

This results in lower CPU overhead in most cases.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
c845984690 st/mesa: sink and simplify texBaseFormat getting for sampler states
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
8aba778fa2 st/mesa: don't set sampler states for TBOs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
222a910a9b st/mesa: optimize sampler state translation code
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
39ab9fb36c st/mesa: sink code needed for apply_texture_swizzle_to_border_color
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
588371b772 st/mesa: simplify update_shader_samplers
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
18d498a1ae st/mesa: when binding sampler states, don't check the max sampler limit
The GLSL linker takes care of it.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
fd86876fe4 st/mesa: don't unbind sampler states if none are used
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
b43c887a9b st/mesa: unify update_gp/tcp/tep code
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
56a28ace35 st/mesa: don't search through shader variants if there is only one
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
dbf1413014 st/mesa: don't track shader variants in st_context
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
1c818fff0c st/mesa: move blend color into its own state atom
This is now sensible thanks to the NewBlendColor flag.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
0b03d82f9c st/mesa: check correctly if multisampling is enabled
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
9c499e6759 st/mesa: don't invoke st_finalize_texture & st_convert_sampler for TBOs
This is a v2 of the previous patch (v1 didn't skip st_finalize_texture).

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
c0ed52f614 mesa: simplify _mesa_is_image_unit_valid for buffers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
caf39d6df9 mesa: don't flag _NEW_PROGRAM_CONSTANTS for GLSL programs for st/mesa
v2: also update _mesa_uniform_handle for bindless textures

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-22 01:51:02 +02:00
Kenneth Graunke
b7ba745032 glsl: Track whether uniforms are active per stage
for finer granularity state flagging

v2: Marek - use a bitmask, add shader cache support

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
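As a rough illustration of the bitmask mentioned in the v2 note of b7ba745032, the sketch below keeps one bit per shader stage for each uniform. The enum, struct, and function names are hypothetical and do not match the GLSL linker's data structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stage numbering; one bit per stage in the mask. */
enum example_stage { EXAMPLE_VERTEX, EXAMPLE_FRAGMENT, EXAMPLE_COMPUTE };

struct example_uniform {
    uint8_t active_stage_mask;
};

/* Record that the uniform is referenced by the given stage. */
static void
example_mark_active(struct example_uniform *u, enum example_stage stage)
{
    u->active_stage_mask |= (uint8_t)(1u << stage);
}

/* Query a single stage, so only that stage's constants need updating. */
static bool
example_is_active(const struct example_uniform *u, enum example_stage stage)
{
    return (u->active_stage_mask & (1u << stage)) != 0;
}
```

With this layout, a uniform update can dirty only the stages whose bit is set rather than all of them.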
Marek Olšák
670c4dd395 mesa: don't flag _NEW_PROGRAM_CONSTANTS for non-GLSL programs for st/mesa
This has the benefit that we get to set up constants for exactly
the shader stage that needs it.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
0b70d6ec56 mesa: flush vertices before updating ctx->_Shader
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
0160a59f29 mesa: set driver flags for glPopAttrib(GL_ENABLE_BIT) properly
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
df0f6a0af3 mesa: don't flag _NEW_POLYGON_STIPPLE for st/mesa
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
58a02196b9 mesa: don't flag _NEW_LINE for st/mesa
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
bc871a1baf mesa: don't flag _NEW_POLYGON for st/mesa
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
00173d91b7 mesa: don't flag _NEW_TRANSFORM for st/mesa if possible
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
55f1106637 mesa: don't flag _NEW_TRANSFORM for Transform.RasterPositionUnclipped
It's not a driver state, it's for glRasterPos.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
1f3dc332f5 mesa: don't flag _NEW_TRANSFORM for primitive restart
It's a draw state.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
29db5c1dcc mesa: don't flag _NEW_VIEWPORT for st/mesa if possible
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
c8363eb027 mesa: flush vertices before changing viewports
Cc: 17.1 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
86840d3f08 mesa: don't flag _NEW_MULTISAMPLE for st/mesa
There are several new driver flags here so that it maps nicely to gallium.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
581d77315b mesa: don't flag _NEW_COLOR for st/mesa if possible
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
b677e96078 mesa: use DriverFlags.NewAlphaTest to communicate alphatest changes to st/mesa
Now AlphaFunc avoids the blend state update in st/mesa and avoids
_mesa_update_state_locked.

The GL_ALPHA_TEST enable won't trigger blend state updates in st/mesa
after st/mesa stops relying on _NEW_COLOR.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
37b834923d mesa: don't flag _NEW_DEPTH for st/mesa
skipping _mesa_update_state_locked

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
a9315627bc mesa: make _mesa_set_varying_vp_inputs a no-op in GL core profile
just don't set _NEW_VARYING_VP_INPUTS.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
95d9fdc6f8 mesa: remove _NEW_BUFFER_OBJECT
not used

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
be15028ede mesa: don't flag _NEW_SCISSOR for st/mesa
Not needed and we get to bypass _mesa_update_state_locked that would be
a no-op.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
124df5d71a mesa: don't execute most of _mesa_update_state_locked for GL core profile
There is plenty of legacy stuff here.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
9c1a6a2082 mesa: simplify handling the return value of update_program
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
bc4e914f95 mesa: simplify a loop in _mesa_update_texture_state
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
68a0e15f44 mesa: replace VP/FP/ATIfs _Enabled flags with helper functions
These are only used in the GL compatibility profile.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
585c5cf8a5 mesa: don't update draw buffer bounds in _mesa_update_state
st/mesa doesn't need the draw bounds for draw calls. I've added the call
where it's necessary in core Mesa and drivers, but I suspect that most
drivers can just move the call to the right places.

The core Mesa places aren't hot paths, so the call overhead doesn't matter
there.

For now, only st/mesa is made such that this function is invoked very
rarely.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
ab784e0fee mesa: remove update_framebuffer_size
For the default framebuffer, _mesa_resize_framebuffer updates it.
For FBOs, _mesa_test_framebuffer_completeness updates it.

This code is redundant.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
e7a091936f mesa: replace ctx->Polygon._FrontBit with a helper function
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
c19b08b079 mesa: replace ctx->VertexProgram._TwoSideEnabled with a helper function
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
Marek Olšák
480bf7731b mesa: stop using _NEW_STENCIL with st/mesa, use DriverFlags.NewStencil instead
This bypasses _mesa_update_state_locked.

Before:
   DrawElements ( 1 VBOs, 4 UBOs,  8 Tex) w/ stencil enable change:    3.99 million
   DrawArrays   ( 1 VBOs, 4 UBOs,  8 Tex) w/ stencil enable change:    4.56 million

After:
   DrawElements ( 1 VBOs, 4 UBOs,  8 Tex) w/ stencil enable change:    4.93 million
   DrawArrays   ( 1 VBOs, 4 UBOs,  8 Tex) w/ stencil enable change:    5.84 million

It's quite a difference in the draw call rate when ctx->NewState stays
equal to 0 the whole time.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:51:02 +02:00
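For reference, the before/after figures in 480bf7731b work out to a gain of roughly 24% for DrawElements (3.99 to 4.93 million) and roughly 28% for DrawArrays (4.56 to 5.84 million). The helper below is a hypothetical one-liner that reproduces this arithmetic; it is not part of the benchmark.

```c
/* Relative improvement of `after` over `before`, in percent. */
static double
example_percent_gain(double before, double after)
{
    return (after - before) / before * 100.0;
}
```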
Marek Olšák
c2408838c8 mesa: replace _mesa_update_stencil() with helper functions
The idea is to remove the dependency on _mesa_update_state_locked,
so that st/mesa can skip it for stencil state updates, and then stop
setting _NEW_STENCIL in mesa/main if the driver is st/mesa.

The main motivation is to stop invoking _mesa_update_state_locked for
certain state groups.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-22 01:48:30 +02:00
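The mechanism behind c2408838c8 and the related "don't flag _NEW_*" commits can be sketched like this. It assumes, based only on the commit messages, that a driver may supply its own dirty bit and that the legacy bit is used when it does not; the struct, field, and macro names are hypothetical.

```c
#include <stdint.h>

#define EXAMPLE_NEW_STENCIL (1u << 0)  /* stand-in for a legacy _NEW_* bit */

struct example_context {
    uint32_t new_state;           /* legacy dirty bits; nonzero forces the full state update */
    uint64_t new_driver_state;    /* driver-private dirty bits */
    uint64_t driver_new_stencil;  /* 0 when the driver relies on the legacy bit */
};

/* Flag a stencil change. If the driver provided its own bit, the legacy
 * mask stays untouched and the generic update pass can be skipped. */
static void
example_flag_stencil(struct example_context *ctx)
{
    if (ctx->driver_new_stencil)
        ctx->new_driver_state |= ctx->driver_new_stencil;
    else
        ctx->new_state |= EXAMPLE_NEW_STENCIL;
}
```

In this sketch, a driver that sets its own bit never sees `new_state` become nonzero on a stencil change.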
Marek Olšák
d28cc798bd meta: do the full FBO completeness check in decompress_texture_image
_mesa_update_state will no longer recompute Width/Height if the framebuffer
is complete. We now rely on the FBO completeness check to do it.

The only code that needs to be fixed seems to be this one.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2017-06-22 01:48:30 +02:00
Pohjolainen, Topi
6a86795a3d i965/gen6: Use isl-based miptree also for stencil rbs
Fixes dEQP-EGL.functional.image.render_multiple_contexts.
gles2_renderbuffer_stencil_stencil_buffer

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-21 16:25:24 -07:00
Ian Romanick
c06b1d3c16 i965: Remove spurious mutex frobbing around call to intel_miptree_blit
These locks were added in 2f28a0dc, but I don't see anything in the
intel_miptree_blit path that should make this necessary.

When asked, Kristian says:

    I doubt it's needed now with the new blorp. If I remember correctly,
    I had to drop the lock there since intel_miptree_blit() could hit
    the XY blit path that requires a fast clear resolve. The fast
    resolve being meta, would then try to lock the texture again.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2017-06-21 14:34:56 -07:00
Eric Engestrom
4a1238a452 egl: turn one more boolean int into a bool
Same as the previous commit, but this one was split out because it's
a bit more complicated: this field is given as a pointer to a function,
so the function had to be changed as well, and the function was used in
a bunch of places, which needed updating as well.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-21 21:42:14 +01:00
Eric Engestrom
60f984262c egl: turn boolean ints into bools
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-21 21:42:08 +01:00
Jason Ekstrand
17918a0372 i965/miptree: Move isl_surf_get_(hiz|mcs)_surf out of the assert
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101535
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101538
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101539
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-21 11:21:19 -07:00
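The subject of 17918a0372 points at a common C pitfall: an expression inside assert() is not evaluated when NDEBUG is defined, so any side effect it carries is lost in release builds. The sketch below shows the safe form with a hypothetical function standing in for the isl calls; it is an assumption that this is the failure mode behind the linked bug reports.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a call that fills in a surface description. */
static bool
example_get_surf(int *out)
{
    *out = 42;
    return true;
}

static int
example_fill(void)
{
    int surf = 0;

    /* Unsafe form: assert(example_get_surf(&surf));
     * With NDEBUG defined the call would vanish and surf would stay 0. */
    bool ok = example_get_surf(&surf);
    assert(ok);
    (void)ok;  /* avoid an unused-variable warning in release builds */

    return surf;
}
```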
Rafael Antognolli
78b843af3c intel/genxml: Use the same naming convention for Floating Point Mode.
In newer gens, this field has a prefix and the non-IEEE-754 mode is called
"Alternate", instead of simply "Alt".

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli
ce728594fd intel/genxml: Normalize URB Data field in WM_STATE.
On gen6+, this is called "Dispatch GRF Start Register For Constant/Setup Data
0", while on gen5 and lower it's called only "Dispatch GRF Start Register For
URB Data", but it's essentially the same thing (URB data), so rename it to
match newer gens and simplify the C code that handles it.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli
44415056e7 intel/genxml: Rename field on WM_STATE to match gen6+.
"Pixel Shader Kill Pixel" -> "Pixel Shader Kills Pixel", which is how it's
called on newer gens.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli
82c66965ac intel/genxml: Normalize fields on WM_STATE.
On gen4, WM_STATE only has one Kernel Start Pointer and one GRF Register
Count, but we can make the code that handles this on multiple gens simpler if
we add an index 0 to it too.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli
eddb1ebccf intel/genxml: Add missing field to CLIP_STATE.
Just because it's not set doesn't mean that it doesn't exist. And since the
field is there on newer gens, having it on gen5 simplifies the code when
porting gen5 and lower.

Also add missing value to API Mode on CLIP_STATE on gen4.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli
9a5ae19cbb intel/genxml: Fix type of UserClipFlags ClipTest Enable Bitmask.
This is a bitmask, so it can't be a boolean. Also rename it so it matches
gen6+.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli
19d1defcd5 intel/genxml: Add missing fields to CLIP_STATE on gen4-5.
These fields are set by brw_clip_unit, so we need them when converting to
genxml.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Rafael Antognolli
faa4f5c42d intel/genxml: Normalize GS_STATE.
Rename "Rendering Enable" to "Rendering Enabled", so it matches gen6+.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-21 10:16:05 -07:00
Ville Syrjälä
0eef03a6f2 i915: Always emit W on gen3
Unlike the older gen2 hardware, gen3 performs perspective
correct interpolation even for the primary/secondary colors.
To do that it naturally needs us to emit W for the vertices.

Currently we emit W only when at least one texture coordinate
set gets emitted. This means the interpolation of color will
change depending on whether texcoords/varyings are used or not.
That's probably not what anyone would expect, so let's just
always emit W to get consistent behaviour. Trying to avoid
emitting W seems like more hassle than it's worth, especially
as bspec seems to suggest that the hardware will perform the
perspective division anyway.

This used to be broken until it was accidentally fixed in
commit c349031c27 ("i915: Fix texcoord vs. varying collision
in fragment programs") by introducing a bug that made the driver
always emit W. After fixing that bug in commit c1eedb43f3
("i915: Fix wpos_tex vs. -1 comparison") we went back to the
old behaviour and caused an apparent regression.

Fixes: c1eedb43f3 ("i915: Fix wpos_tex vs. -1 comparison")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101451
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-06-21 13:10:58 +03:00
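To show why W matters for color interpolation in 0eef03a6f2, here is the textbook perspective-correct interpolation of an attribute along an edge. It is a generic formula written as a hypothetical helper, not i915 driver code or a description of the hardware's exact arithmetic.

```c
/* Perspective-correct interpolation between two vertices.
 * a0, a1: attribute values (for example one color channel)
 * w0, w1: clip-space W of each vertex (assumed nonzero)
 * t:      screen-space interpolation factor in [0, 1] */
static double
example_perspective_lerp(double a0, double w0, double a1, double w1, double t)
{
    double num = (1.0 - t) * (a0 / w0) + t * (a1 / w1);
    double den = (1.0 - t) * (1.0 / w0) + t * (1.0 / w1);
    return num / den;
}
```

When both W values are equal the result reduces to plain linear interpolation, which is why the output changes depending on whether real W values are supplied.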
Samuel Pitoiset
26fbdb12f4 mesa: add KHR_no_error support for glStencilOp()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-21 08:47:26 +02:00
Samuel Pitoiset
5407662570 mesa: add stencil_op() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-21 08:47:24 +02:00
Samuel Pitoiset
e6659c560a mesa: add KHR_no_error support for glStencilFunc()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-21 08:47:22 +02:00
Samuel Pitoiset
db967dcb05 mesa: add stencil_func() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-21 08:47:18 +02:00
Samuel Pitoiset
b9e2d5c18d mesa: add KHR_no_error support for glStencilOpSeparate()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-21 08:47:16 +02:00
Samuel Pitoiset
0614b7a6f7 mesa: add stencil_op_separate() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-21 08:47:14 +02:00
Samuel Pitoiset
d222e14ffa mesa: add KHR_no_error support for glStencilMaskSeparate()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-21 08:47:12 +02:00
Samuel Pitoiset
8ab0aaa350 mesa: add stencil_mask_separate() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-21 08:47:10 +02:00
Samuel Pitoiset
9c49c9d8dd mesa: add KHR_no_error support for glStencilFuncSeparate()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-21 08:47:09 +02:00
Samuel Pitoiset
6f10d93ea4 mesa: add stencil_func_separate() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-21 08:47:06 +02:00
Lucas Stach
629003b5b8 etnaviv: fix blend color for RB swapped rendertargets
Same as with the colormasks, the blend color needs to be swizzled according
to the rendertarget format.

Signed-off-by: Lucas Stach <dev@lynxeye.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-06-21 07:45:15 +02:00
Jason Ekstrand
1bd0acab21 spirv: Work around the Doom shader bug
Doom shipped with a broken version of GLSLang which handles samplers as
function arguments in a way that isn't spec-compliant.  In particular,
it creates a temporary local sampler variable and copies the sampler
into it.  While Dave has had a hack patch out for a while that gets it
working, we've never landed it because we've been hoping that a game
update would come out with fixed shaders.  Unfortunately, no game update
appears to be on the horizon and I've found this issue in yet another
application so I think we're stuck working around it.  Hopefully, we can
delete this code one day.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99467
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-20 18:51:26 -07:00
Ian Romanick
93055576ae glsl: Update build instructions for int64.glsl
Trivial

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2017-06-20 17:45:49 -07:00
Elie Tournier
7e46be3dec glsl: Fix indent in dump code
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-06-20 17:45:49 -07:00
Ilia Mirkin
8754c5359f st/xvmc: deal with drivers wanting different texture formats
Previously, texture formats were being used unconditionally without
checking. However nv30 supports neither RGBX8 nor R4A4/A4R4 formats. Add
sufficient fallbacks so that the nv30 driver can have working OSD.

Tested on a NV44A/PCI.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-06-20 20:20:55 -04:00
Ben Skeggs
7bae3ef812 nvc0: fix transfer of larger rectangles with DmaCopy on gk104 and up
By treating the rectangles as 1cpp, we can run up against some internal
copy engine limits and trigger a MEM2MEM_RECT_OUT_OF_BOUNDS error check
at launch time.

This commit enables the REMAP hardware, which allows us to specify both
the component size and number of components for a transfer.  We're then
able to pass in the real width/nblocksx values and not hit the limits.

There's a couple of "supported" CPPs in the list that we can't actually
hit, but are there simply because they're possible.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-06-20 20:18:54 -04:00
Ben Skeggs
ec3d489d5b nvc0: copy engine surface params are only relevant for tiled surfaces
Aside from reducing pushbuf usage in some situations, this commit should
have no other effect, and is just to make it somewhat obvious that those
methods have zero effect on linear surfaces.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-06-20 20:18:54 -04:00
Dave Airlie
72c8c68458 st/mesa: fix assert to be simpler
I just noticed a warning with a non-debug build, but really
this could all be one line, and I'm not even 100% sure the assert
makes sense here.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-21 08:59:18 +10:00
Lionel Landwerlin
030abc6109 intel: compiler/i965: fix is_broxton checks
In 5f2fe9302c is_geminilake was introduced to differentiate
broxton from geminilake. Unfortunately I failed to verify that
is_broxton is used throughout the code base to mean Gen9lp.

Fixes: 5f2fe9302c ("intel: common: add flag to identify platforms by name")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-20 23:26:42 +01:00
Plamena Manolova
b3b6121115 mesa/main: Move NULL pointer check.
In blit_framebuffer we're already doing a NULL
pointer check for readFb and drawFb so it makes
sense to do it before we actually use the pointers.

CID: 1412569
Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-20 13:57:20 -07:00
George Kyriazis
12f52942f8 swr: Include definition of missing function
Inline function SWR_MULTISAMPLE_POS::PrecalcSampleData() was missing
definition.  Include definition in core/state_funcs.h.

Fixes windows build.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-06-20 14:42:42 -05:00
Ben Widawsky
3e1055591b i965/cnl: Add l3 configuration for Cannonlake
V2 (Anuj):
Squash the changes in one patch rebase on master.
Address the review comments made by Francisco Jerez.
Do the URB allocation per slice (not per bank).

V3 (Anuj):
Update the comment.
Format the table as other l3 config tables.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
---
V1 was sent out with the heading:
"i965/cnl: Properly handle l3 configuration"
2017-06-20 12:18:26 -07:00
Anuj Phogat
1024dad4d9 i965: Add a variable for way size per bank in get_l3_way_size()
Adding this variable better explains the computation of L3 way
size in the function.

V2: Use const variable for way_size_per_bank.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-06-20 12:18:26 -07:00
Anuj Phogat
8521559e08 i965: Fix broxton 2x6 l3 config
The new table added in this patch matches with the table
in gfxspecs. We were programming the wrong values earlier.

V2: Update the comment.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-06-20 12:18:26 -07:00
Ian Romanick
4eb3747544 i965: Fall back to normal blorp clear instead of meta clear
When intel_miptree_alloc_non_msrt_mcs fails, fall back to normal blorp
color clear instead of falling back to meta.  With this change,
brw_blorp_clear_color can never fail.

v2: Combine two if-statements to remove a level of indentation.
Suggested by Jason.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-20 11:07:02 -07:00
Ian Romanick
cbb941cdec intel/blorp: Apply source offset in the TEX case
Previously the offset was only applied in the TXF case.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-20 11:07:02 -07:00
Ian Romanick
990f2be139 intel/blorp: Apply Gen4 coord. normalization after cubemap sizes are adjusted
Otherwise the values used for coordinate normalization use the wrong
sizes.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-20 11:07:02 -07:00
Jason Ekstrand
b2dd61196e intel/blorp: Set needs_(dst|src)_offset for Gen4 cubemaps
We call convert_to_single_slice so they may end up with a non-trivial
offset that needs to be taken into account.

v2 (idr): Also set needs_src_offset.  Suggested by Jason.

Fixes ES2-CTS.functional.texture.specification.basic_copyteximage2d.cube_rgba
and ES2-CTS.functional.texture.specification.basic_copytexsubimage2d.cube_rgba
on G45.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101284
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-20 11:07:02 -07:00
Ian Romanick
cc14286930 meta/blit: Silence unused parameter warning
drivers/common/meta_blit.c: In function ‘setup_glsl_msaa_blit_scaled_shader’:
drivers/common/meta_blit.c:62:58: warning: unused parameter ‘filter’ [-Wunused-parameter]
                                    GLenum target, GLenum filter)
                                                          ^~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-20 11:07:02 -07:00
Ian Romanick
37164da272 meta: Silence unused parameter warning
drivers/common/meta.c:2694:71: warning: unused parameter ‘dims’ [-Wunused-parameter]
 copytexsubimage_using_blit_framebuffer(struct gl_context *ctx, GLuint dims,
                                                                       ^~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-20 11:07:02 -07:00
Ian Romanick
691beaf241 i965: Fix incorrect comment
There is no intel_miptree_slice_has_hiz function, but there is an
intel_miptree_level_has_hiz function.  I assume that's the correct one
to use.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-20 11:07:01 -07:00
Samuel Pitoiset
4f00b2bc7e mesa: simplify _mesa_IsVertexArray()
_mesa_lookup_vao() already returns NULL if id is zero.

v2: - change the conditional (Ian)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (v1)
2017-06-20 19:15:17 +02:00
Eric Engestrom
cb3e01ca71 mesa/format_info: use designated initialiser list
Also, make that table const, since no-one is supposed to modify it anyway.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-06-20 17:55:13 +01:00
Eric Anholt
45b0172693 vc4: Clean up release build warnings using MAYBE_UNUSED.
These variables are all used in an assert(), so release builds see no
usages.
2017-06-20 09:09:09 -07:00
Eric Anholt
743dcdd936 vc4: Allow VBOs to be mapped during execution.
There's no reason we can't -- the mappings we expose are basically
equivalent to persistent/coherent, already.

Improves mesa-demos drawoverhead (no state change) performance by
5.21362% +/- 1.25078% (n=11).
2017-06-20 09:05:44 -07:00
Brian Paul
d8148ed10a gallium/vbuf: avoid segfault when we get invalid glDrawRangeElements()
A common user error is to call glDrawRangeElements() with the 'end'
argument being one too large.  If we use the vbuf module to translate
some vertex attributes this error can cause us to read past the end of
the mapped hardware buffer, resulting in a crash.

This patch adjusts the vertex count to avoid that issue.  Typically,
the vertex_count gets decremented by one.
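
The adjustment described above can be sketched roughly as follows. This is
only an illustration in Python, not the vbuf code itself: the function name
and all parameters (clamp_vertex_count, stride, attrib_size, ...) are
hypothetical and chosen for the example.

```python
def clamp_vertex_count(start, count, stride, offset, buffer_size, attrib_size):
    """Return the largest vertex count <= count that stays inside the buffer.

    All names here are hypothetical; sizes and offsets are in bytes.
    """
    if count <= 0:
        return 0
    if stride == 0:
        # Every vertex reads the same location.
        return count if offset + attrib_size <= buffer_size else 0
    first = offset + start * stride
    if first + attrib_size > buffer_size:
        return 0
    # Number of whole attributes that fit, starting at `first`.
    fit = (buffer_size - first - attrib_size) // stride + 1
    return min(count, fit)


# A buffer holding 4 vertices of 16 bytes each; the caller asks for 5.
clamped = clamp_vertex_count(start=0, count=5, stride=16, offset=0,
                             buffer_size=64, attrib_size=16)
```

With an 'end' argument that is one too large, the requested count of 5 is
reduced to the 4 vertices that actually exist in the buffer.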

This fixes crashes with the Unigine Tropics and Sanctuary demos with older
VMware hardware versions.  The issue isn't hit with VGPU10 because we
don't hit this fallback.

No piglit changes.

CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-20 08:03:18 -06:00
Brian Paul
2a9d8a45a6 gallium/vbuf: add some const qualifiers
Helps understandability a bit.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-20 08:03:12 -06:00
Brian Paul
ed83e73c4e translate: whitespace fixes in translate_generic.c
2017-06-20 07:56:34 -06:00
Brian Paul
ceb9ca7fa5 softpipe: remove unused softpipe_context::line_stipple_counter
Trivial.
2017-06-20 07:56:34 -06:00
Samuel Pitoiset
ea2492b62f radeonsi: set correct usage flag according to image access type
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-20 13:01:18 +02:00
Marek Olšák
58af1f6bb0 winsys/amdgpu: fix a deadlock when waiting for submission_in_progress
First this happens:

1) amdgpu_cs_flush (lock bo_fence_lock)
   -> amdgpu_add_fence_dependency
   -> os_wait_until_zero (wait for submission_in_progress) - WAITING

2) amdgpu_bo_create
   -> pb_cache_reclaim_buffer (lock pb_cache::mutex)
   -> pb_cache_is_buffer_compat
   -> amdgpu_bo_wait (lock bo_fence_lock) - WAITING

So both bo_fence_lock and pb_cache::mutex are held. amdgpu_bo_create can't
continue. amdgpu_cs_flush is waiting for the CS ioctl to finish the job,
but the CS ioctl is trying to release a buffer:

3) amdgpu_cs_submit_ib (CS thread - job entrypoint)
   -> amdgpu_cs_context_cleanup
   -> pb_reference
   -> pb_destroy
   -> amdgpu_bo_destroy_or_cache
   -> pb_cache_add_buffer (lock pb_cache::mutex) - DEADLOCK

The simple solution is not to wait for submission_in_progress, which we
need in order to create the list of dependencies for the CS ioctl. Instead
of building the list of dependencies as a direct input to the CS ioctl,
build the list of dependencies as a list of fences, and make the final list
of dependencies in the CS thread itself.

Therefore, amdgpu_cs_flush doesn't have to wait and can continue.
Then, amdgpu_bo_create can continue and return. And then amdgpu_cs_submit_ib
can continue.
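
The idea of deferring dependency resolution to the CS thread can be
illustrated with a small Python model. This is a sketch under assumptions,
not the winsys code: Fence, flush and submit_thread are hypothetical names,
and a threading.Event stands in for submission_in_progress reaching zero.

```python
import queue
import threading


class Fence:
    """Hypothetical stand-in for a command-stream fence."""

    def __init__(self, name):
        self.name = name
        self.seq = None                     # assigned at submission time
        self.submitted = threading.Event()  # set once the job was submitted


def flush(jobs, name, deps):
    """Queue a job and return immediately; dependencies stay as fences."""
    fence = Fence(name)
    jobs.put((fence, list(deps)))
    return fence


def submit_thread(jobs, log):
    """Resolve fences into final dependency numbers, then 'submit'."""
    next_seq = 0
    while True:
        job = jobs.get()
        if job is None:           # sentinel: no more work
            return
        fence, deps = job
        for dep in deps:
            dep.submitted.wait()  # only this thread ever blocks here
        log.append((fence.name, [dep.seq for dep in deps]))
        fence.seq = next_seq
        next_seq += 1
        fence.submitted.set()


jobs, log = queue.Queue(), []
worker = threading.Thread(target=submit_thread, args=(jobs, log))
worker.start()
first = flush(jobs, "cs0", [])
second = flush(jobs, "cs1", [first])  # does not wait for cs0
jobs.put(None)
worker.join()
```

The flushing thread never waits while holding a lock; the wait happens only
in the submission thread, which already knows the order of the jobs.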

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101294

Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-20 12:53:46 +02:00
Samuel Pitoiset
afeaa2e98a radeonsi: update all resident texture descriptors when needed
To avoid useless DCC fetches when DCC is disabled, descriptors
have to be updated in order to reflect this change. This is
quite similar to how we update descriptors of bound textures.

As a side effect, this should also prevent VM faults when
bindless textures are invalidated, because the VA in the
descriptor has to be updated accordingly as well.

I don't see any performance improvements with DOW3.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-20 10:14:55 +02:00
Samuel Pitoiset
f00e80e3f7 radeonsi: keep track of the sampler state for texture handles
Needed for updating all resident texture descriptors when
dirty_tex_counter changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-20 10:14:52 +02:00
Lionel Landwerlin
bf5ca4f0b2 i965: perf: use gen_device_info rather than brw_context
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-19 22:11:00 +01:00
Lionel Landwerlin
6d759cbd49 intel: common: add number of thread per eu
This will be used to normalize OA counters.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-19 22:11:00 +01:00
Lionel Landwerlin
c77d98ef32 intel: common: express timestamps units in frequency
Rather than storing the period as a double that loses some precision.
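
A rough Python illustration of why an integer frequency is preferable to a
rounded period. The helper names are hypothetical, and 52.083 ns is only an
example of a rounded period for a 19.2 MHz clock, not necessarily the value
the driver stored.

```python
NS_PER_SECOND = 1_000_000_000


def ticks_to_ns(ticks, frequency_hz):
    """Exact integer conversion using the timestamp frequency."""
    return ticks * NS_PER_SECOND // frequency_hz


def ticks_to_ns_with_period(ticks, period_ns):
    """Conversion using a floating-point period; rounding error accumulates."""
    return int(ticks * period_ns)


one_second_exact = ticks_to_ns(19_200_000, 19_200_000)
one_second_rounded = ticks_to_ns_with_period(19_200_000, 52.083)
```

After one second of ticks the frequency-based conversion returns exactly
one second, while the rounded period comes up several microseconds short.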

Also fixes the Gen9LP timestamp frequency, which is not 19200123 but
19200000, as pointed out by Ville:

https://lists.freedesktop.org/archives/intel-gfx/2017-April/125126.html

Finally add the Cannonlake timestamp frequency.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-19 22:11:00 +01:00
Lionel Landwerlin
e5743ee014 i965: convert MI_REPORT_PERF_COUNT to genxml
Also change its availability from gen7 only to gen7+.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-19 22:11:00 +01:00
Lionel Landwerlin
a26f8d99a6 i965: perf: fix codegen with single operand equation
We did support single value operand equations, but not single variable
operand ones. In particular we were failing on "$Sampler0Bottleneck".

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-19 22:11:00 +01:00
Lionel Landwerlin
5f2fe9302c intel: common: add flag to identify platforms by name
The perf infrastructure needs to identify specific platforms, not just
generations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-19 22:11:00 +01:00
Topi Pohjolainen
b539f6958e i965/wm: Use stored hiz surface instead of creating copy
Now the last user of intel_miptree_get_aux_isl_surf() is gone.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:57:57 +03:00
Topi Pohjolainen
7e4ea22762 i965/blorp: Use hiz surface instead of creating copy
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:57:57 +03:00
Topi Pohjolainen
f60e23cb57 i965/miptree/gen7+: Use isl for hiz layouts
v2: Use better assert by checking isl_surf_get_hiz_surf()

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:57:57 +03:00
Topi Pohjolainen
67b44a8423 i965/miptree: Drop BO_ALLOC_FOR_RENDER in intel_miptree_alloc_mcs()
because buffers get unconditionally initialised by cpu writing.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:57:57 +03:00
Topi Pohjolainen
1a43d774b6 i965/miptree: Use isl for mcs layouts
and pass the ccs isl surface to blorp instead of creating a
copy.

v2 (Jason): Explain ccs change and use better assert checking
            isl_surf_get_mcs_surf()

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:57:57 +03:00
Topi Pohjolainen
31bd461816 i965/miptree: Refactor aux surface allocation
v2 (Jason): Drop unused argument in intel_alloc_aux_buffer() and
            move assignment of "buf->surf" in intel_alloc_aux_buffer()
            into this patch.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:57:57 +03:00
Topi Pohjolainen
7e25410563 i965/gen6: Use isl for hiz
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:57:57 +03:00
Topi Pohjolainen
59e5519afa i965/miptree: Refactor isl aux usage resolver
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:57:56 +03:00
Topi Pohjolainen
d8a4b8bc88 i965/gen6: Use isl for stencil surfaces
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:57:56 +03:00
Topi Pohjolainen
0e816c9deb i965/miptree: Prepare range getter for isl based
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:57:56 +03:00
Topi Pohjolainen
a808eb172a i965/miptree: Prepare stencil mapping for isl based
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:57:56 +03:00
Topi Pohjolainen
7294cde750 i965/blorp: Prepare for isl based miptrees
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:57:56 +03:00
Topi Pohjolainen
3cf470f2b6 i965: Add isl based miptree creator
v2: Use new brw_bo_alloc_tiled() interface

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:57:44 +03:00
Topi Pohjolainen
5d125f999e i965/miptree: Add option to resolve offsets using isl_surf
v2 (Nanley): Add comment telling why "level -= mt->first_level"

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:41:45 +03:00
Topi Pohjolainen
71ac909137 i965: Prepare slice copy for isl based miptrees
v2 (Jason): Fix a helper variable only used for assert -
            open code instead.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:41:45 +03:00
Topi Pohjolainen
de158c1e43 i965/tex: Prepare image update for isl based miptrees
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:41:45 +03:00
Topi Pohjolainen
bb9c4113dc i965: Prepare framebuffer validator for isl based miptrees
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:41:45 +03:00
Topi Pohjolainen
c05817ffc5 i965: Prepare slice validator for isl based miptrees
v2 (Nanley): Minify depth in case of 3D surface. Also moved to
             .c file to get minify() without additional
             header inclusions

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:41:45 +03:00
Topi Pohjolainen
143e3a679a i965: Prepare image validation for isl based miptrees
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:41:45 +03:00
Topi Pohjolainen
41a7a0e548 i965: Prepare up/downsampling for isl based miptrees
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:41:45 +03:00
Topi Pohjolainen
02fa622037 i965/miptree: Add isl surface
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:41:45 +03:00
Topi Pohjolainen
5a3105fe9a i965: Add helper for converting isl tiling to bufmgr tiling
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:41:45 +03:00
Topi Pohjolainen
a7480d3f03 i965/miptree: Refactor mapping table alloc
v2 (Nanley): Use minify() instead of direct shift
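
For illustration, the difference between minify() and a direct shift, as a
Python sketch. The clamp-to-one behaviour mirrors Mesa's minify() helper;
depth_per_level is a hypothetical name used only in this example.

```python
def minify(value, levels):
    """Size of a mip level: shift right, but never drop below 1."""
    return max(1, value >> levels)


def depth_per_level(depth, num_levels):
    """Hypothetical helper: depth of a 3D surface at each mip level."""
    return [minify(depth, level) for level in range(num_levels)]


depths = depth_per_level(4, 5)
```

A direct shift would yield 0 for the last two levels here, whereas every
level of a 3D surface still has at least one slice.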

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:41:39 +03:00
Topi Pohjolainen
335543699a i965/gen6: Declare minify(depth, level) layers for 3D stencil
Keeps the following patch, which refactors the table allocation,
non-functional.

Suggested-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:18:53 +03:00
Topi Pohjolainen
a5e1c9f1d5 i965/gen4: Add support for single layer in alignment workaround
On gen < 6 one doesn't have level or layer specifiers available
for render and depth targets. In order to support rendering to
specific level/layer, driver needs to manually offset the surface
to the desired slice.
There are, however, alignment restrictions to respect as well and
in some cases the only option is to use a temporary single slice
surface which the driver copies to the full miptree after rendering.

Current alignment workaround introduces new texture images which
are added to the parent texture object. Texture validation later
on copies the additional levels back to the surface that contains
the full mipmap.
This only works for non-arrayed surfaces, and the driver currently
creates new arrayed images in vain - individual layers within the
newly created images are still unaligned the same as before.

This patch drops this mechanism and instead attaches single
temporary slice into the render buffer. This gets immediately
copied back to the mipmapped and/or arrayed surface just after
the render is done.

Sitting on top of earlier series cleaning up the depth buffer
state, this patch additionally fixes the following piglit tests:

    arb_framebuffer_object.fbo-generatemipmap-cubemap.g965m64
    arb_texture_cube_map.copyteximage cube.g965m64
    arb_texture_cube_map.copyteximage cube.ilkm64
    arb_pixel_buffer_object.texsubimage array pbo.g965m64
    ext_framebuffer_object.fbo-cubemap.g965m64
    ext_texture_array.copyteximage 1d_array.g45m64
    ext_texture_array.copyteximage 1d_array.g965m64
    ext_texture_array.copyteximage 1d_array.ilkm64
    ext_texture_array.copyteximage 2d_array.g45m64
    ext_texture_array.copyteximage 2d_array.g965m64
    ext_texture_array.copyteximage 2d_array.ilkm64
    ext_texture_array.fbo-array.g965m64
    ext_texture_array.fbo-generatemipmap-array.g965m64
    ext_texture_array.gen-mipmap.g965m64

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:18:53 +03:00
Topi Pohjolainen
a9c59c10a5 i965/miptree: Separate src and dst slice specifiers in slice copy
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:18:53 +03:00
Topi Pohjolainen
920c8e89c5 i965/miptree: Clarify face/level/layer in slice copy
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-19 22:18:53 +03:00
Jonas Kulla
a52ee32a9a anv: Fix L3 cache programming on Bay Trail
Valid values for URBAllocation start at 32, so subtract that
before programming the register.

This was missed when porting from the GL driver.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-19 12:05:52 -07:00
Marek Olšák
3fc99f1299 radeonsi: fix dumping shader descriptors into ddebug logs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-19 20:16:20 +02:00
Marek Olšák
f9dc29a9a5 radeonsi: add a workaround for inexact SNORM8 blitting again
GFX9 is affected.

We only have tests for GL_x_SNORM where x is R8, RG8, RGB8, and RGBA8.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-19 20:15:36 +02:00
Marek Olšák
0f827b51c0 radeonsi/gfx9: fix TC-compatible stencil compression
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-19 20:15:36 +02:00
Marek Olšák
8a264dd829 radeonsi/gfx9: fix TXF_LZ with 1D textures
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-19 20:15:36 +02:00
Marek Olšák
353b60cab5 radeonsi/gfx9: disable sparse buffers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-19 20:15:36 +02:00
Marek Olšák
064f07fef3 ac/sid.h: don't use parentheses in PKT3_RELEASE_MEM definition
The parser skips the line if it contains parentheses.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-19 20:15:36 +02:00
Marek Olšák
ed291cea3d ac: parse EVENT_WRITE_EOP, RELEASE_MEM, WAIT_REG_MEM, NOWHERE
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-19 20:15:36 +02:00
Marek Olšák
66b6babbea st/mesa: simplify returning GL_VENDOR
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-19 20:09:52 +02:00
Marek Olšák
92b4ca4550 st/mesa: remove the "Gallium 0.4 on" prefix from GL_RENDERER
If you want to keep it for your driver, please raise your hand.
The prefix will probably have to be added into the driver instead of here.

I cringe when I look at my long renderer string:
  Gallium 0.4 on AMD Radeon R9 Fury Series (DRM 3.17.0 / 4.11.0-staging-01277-gab25a9e, LLVM 5.0.0)

I'm sincerely sorry for all apps that detect Mesa by expecting "Gallium"
in the string.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-06-19 20:09:52 +02:00
Marek Olšák
61dc2c964e st/mesa: don't update MSAA states for GL_FRAMEBUFFER_SRGB
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-19 20:09:52 +02:00
Kenneth Graunke
6a7c5257ca i965: Ignore anisotropic filtering in nearest mode.
This fixes both Europa Universalis IV and Stellaris rendering on i965.
This was tested on SKL.

This fix was discovered by Jakub Szuppe at Stream HPC
(https://streamhpc.com/).

bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96958
bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95530
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
2017-06-19 10:09:06 -07:00
Iago Toral Quiroga
b70d6a2de1 glsl: gl_Max{Vertex,Fragment}UniformComponents exist in all desktop GL versions
The current implementation assumed that these were replaced in GLSL >= 4.10
by gl_Max{Vertex,Fragment}UniformVectors, however this is not true: both
built-ins should be produced from GLSL 4.10 onwards.

This was raised by new CTS tests that are in development.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-19 14:43:54 +02:00
Emil Velikov
4a7222518d docs: update calendar, add news item and link release notes for 17.1.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-19 12:23:07 +01:00
Emil Velikov
42098bf9b2 docs: add sha256 checksums for 17.1.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-19 12:20:52 +01:00
Emil Velikov
b55dfb7be3 docs: add release notes for 17.1.3
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-19 12:20:51 +01:00
Nicolai Hähnle
b28938ffce st/glsl_to_tgsi: use correct writemask when converting generic intrinsics
This fixes a bug when lowering ballotARB: previously, using writemask 0xf,
emit_asm would create TGSI_OPCODE_BALLOT instructions that span two registers
to cover 4 64-bit channels. This could trample over a neighbouring
temporary.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101360
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-19 12:07:05 +02:00
Nicolai Hähnle
25e5534734 gallium/radeon/gfx9: fix PBO texture uploads to compressed textures
st/mesa creates a surface that reinterprets the compressed blocks as
RGBA16UI or RGBA32UI. We have to adjust width0 & height0 accordingly to
avoid out-of-bounds memory accesses by CB.

Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-19 12:05:15 +02:00
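
The size adjustment described in 25e5534734 can be sketched as follows. This is only an illustration, not the driver code: the function names are hypothetical, and the 4x4 block footprint and the 8-byte/16-byte block sizes are assumptions that hold for common block-compressed formats. When each compressed block is reinterpreted as one integer texel, the surface dimensions shrink to the number of blocks, rounded up.

```python
def reinterpreted_size(width, height, block_w=4, block_h=4):
    """Surface size in texels once each compressed block becomes one texel."""
    blocks_x = (width + block_w - 1) // block_w   # round up to whole blocks
    blocks_y = (height + block_h - 1) // block_h
    return blocks_x, blocks_y


def reinterpreted_format(block_bytes):
    """Integer format with the same bit size as one compressed block (assumed mapping)."""
    return {8: "RGBA16UI", 16: "RGBA32UI"}[block_bytes]


# A 10x6 texture with 4x4 blocks covers 3x2 blocks, so the reinterpreted
# surface must be 3x2 texels; using 10x6 would address memory out of bounds.
print(reinterpreted_size(10, 6), reinterpreted_format(8))
```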
Nicolai Hähnle
4d5bb1b987 r600: fix off-by-one in egd_tables.py
Port of the corresponding fix in sid_tables.py.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-19 12:05:12 +02:00
Nicolai Hähnle
67e49a7f65 amd/common: fix off-by-one in sid_tables.py
The very last entry in the sid_strings_offsets table ended up missing,
leading to out-of-bounds reads and potential crashes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-19 12:03:59 +02:00
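
To illustrate the kind of off-by-one fixed in 67e49a7f65, here is a minimal sketch of a string table with an offsets array. The function names and data layout are hypothetical and are not taken from sid_tables.py; the point is only that every string, including the very last one, needs its own offset entry.

```python
def build_string_table(strings):
    """Pack strings into one NUL-separated blob and record each start offset."""
    blob = bytearray()
    offsets = []
    for s in strings:
        offsets.append(len(blob))            # where this string starts
        blob += s.encode("ascii") + b"\0"
    return bytes(blob), offsets


def lookup(blob, offsets, index):
    """Return the string stored at the given table index."""
    start = offsets[index]
    end = blob.index(b"\0", start)
    return blob[start:end].decode("ascii")


blob, offsets = build_string_table(["ADDR", "BASE", "CNTL"])
# One offset per string. If the final entry were dropped, looking up
# index 2 would read past the end of the offsets table.
assert len(offsets) == 3
```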
Iago Toral Quiroga
b72b7c541d i965: update MaxTextureRectSize to match PRMs and comply with OpenGL 4.1+
We were exposing 4096, but we can do up to 8192 in Gen4-6 and up to
16384 in gen7+. OpenGL 4.1+ requires at least 16384.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-19 07:55:48 +02:00
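
The limits quoted in b72b7c541d can be written as a small lookup. The function name is hypothetical and this is a sketch of the stated numbers only, not the i965 code.

```python
def max_texture_rect_size(gen):
    """Maximum rectangle texture size per the commit message: 8192 on Gen4-6, 16384 on Gen7+."""
    return 16384 if gen >= 7 else 8192
```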
Samuel Pitoiset
10d104207a mesa: add KHR_no_error support for gl*UniformHandleui64*ARB
Similar to _mesa_uniform() except that we have to call
validate_uniform_parameters() instead of validate_uniform().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-18 14:21:05 +02:00
Samuel Pitoiset
304de4edb9 mesa: add KHR_no_error support for glGetImageHandleARB()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-18 14:21:04 +02:00
Samuel Pitoiset
530ff887eb mesa: add KHR_no_error support for glGetTexture*HandleARB()
It would be nice to have a no_error path for
_mesa_test_texobj_completeness() because this function doesn't
only test if the texture is complete.

Anyway, that seems enough for now and a bunch of checks are
skipped with this patch.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-18 14:21:01 +02:00
Samuel Pitoiset
0fb2c89c71 mesa: add KHR_no_error support for glMake{Image,Texture}Handle*ResidentARB()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-18 14:20:59 +02:00
Samuel Pitoiset
d7bee4a022 mesa: add KHR_no_error support for glIs{Image,Texture}HandleResidentARB()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-18 14:20:57 +02:00
Samuel Pitoiset
6ff6863c32 radeonsi: reduce overhead for resident textures which need color decompression
This is done by introducing a separate list.

si_decompress_textures() is now 5x faster.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-18 14:10:38 +02:00
Samuel Pitoiset
06ed251c32 radeonsi: reduce overhead for resident textures which need depth decompression
This is done by introducing a separate list.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-18 14:10:36 +02:00
Samuel Pitoiset
705a6a560e radeonsi: use util_dynarray_foreach for bindless resources
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-18 14:10:34 +02:00
Samuel Pitoiset
db73595018 mesa/util: add util_dynarray_clear() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-18 14:10:32 +02:00
Samuel Pitoiset
8d9e76ce1f gallium/radeon: add a new HUD query for the number of resident handles
Useful for debugging performance issues when ARB_bindless_texture
is enabled. This query doesn't make a distinction between texture
and image handles.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-18 14:08:08 +02:00
Topi Pohjolainen
e08171ef53 i965/gen4: Refactor depth/stencil rebase
Effectively there is the same code twice, once for depth and
again for stencil.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-18 10:46:44 +03:00
Topi Pohjolainen
84b195b361 i965: Drop depth/stencil miptree pointers in alignment workaround
In brw_workaround_depthstencil_alignment() corresponding
renderbuffers are always set to refer to the same temp miptrees.
There is no need to carry them in context.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-18 10:46:44 +03:00
Topi Pohjolainen
cd0804c359 i965/gen4: Simplify depth/stencil invalidate check
There is no separate stencil on gen < 6.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-18 10:46:44 +03:00
Topi Pohjolainen
bb5d3fe96a i965/gen4: Remove redundant check for depth when rebasing stencil
In case of gen < 6 stencil (if present) is always combined with
depth. Both stencil and depth attachments point to the same
physical surface.
Alignment workaround starts by considering depth and updates
stencil accordingly. Current logic continues with stencil and
in vain considers the case where depth would refer to a different
surface than stencil.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-18 10:46:44 +03:00
Topi Pohjolainen
04524ac0d4 i965/gen4: Remove non-existing stencil and hiz buffer setup
Separate stencil and hiz are only enabled for gen6+.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-18 10:46:44 +03:00
Mauro Rossi
58d337941e android: ac: add missing libdrm_amdgpu shared dependency
Fixes building errors in amd/common:

target  C: libmesa_amd_common <= external/mesa/src/amd/common/ac_gpu_info.c
...
target  C: libmesa_amd_common <= external/mesa/src/amd/common/ac_surface.c
...

external/mesa/src/amd/common/ac_gpu_info.h:31:10: fatal error: 'amdgpu.h' file not found
         ^
2 errors

Fixes: 98a2492 ("ac_surface: use radeon_info from ac_gpu_info")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-06-17 18:38:31 +01:00
Emil Velikov
68aa39d5c2 r600: include libelf headers only as needed
Headers are required only when building with OpenCL. As we're building
w/o it libelf may be missing, hence we'll error out as below:

src/gallium/drivers/r600/evergreen_compute.c:27:10:
fatal error: 'gelf.h' file not found
         ^
1 error generated.

Fixes: d96a210842 ("r600g,compute: provide local copy of functions from
ac_binary.c")
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reported-by: Mauro Rossi <issor.oruam@gmail.com>
Tested-by: Mauro Rossi <issor.oruam@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-17 16:57:18 +01:00
Emil Velikov
1f958c1337 radeonsi: include ac_binary.h for struct ac_shader_binary
The header embeds the struct so it needs the header inclusion instead of
the dummy forward declaration.

Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Tom Stellard <tstellar@redhat.com>
Fixes: 32206c5e56 ("radeonsi: Add radeon_shader_binary member to struct
si_shader")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-06-17 11:38:02 +01:00
Emil Velikov
7e1c42cf89 r600, radeon: move radeon_shader_binary_{init,clean} back to radeon
Those are used by r600 and radeonsi, so moving them within the former
was a bad idea.

Fixes: d96a210842 ("r600g,compute: provide local copy of functions
from ac_binary.c")
Cc: Jan Vesely <jan.vesely@rutgers.edu>
Cc: Aaron Watry <awatry@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-06-17 11:37:58 +01:00
Emil Velikov
84bf7e5ad6 ac: resolve conflicts introduced with "ac: remove amdgpu.h dependency"
The commit did not add the relevant includes - in particular
stdint.h and stdbool.h for the respective standard types.

At the same time, the amdgpu_device_handle typedef redeclaration was
off.

Fixes: 81945ded0d ("ac: remove amdgpu.h dependency")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101471
Cc: Mark Janes <mark.a.janes@intel.com>
Cc: Gregor Münch <gr.muench@gmail.com>
Reported-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reported-by: Mark Janes <mark.a.janes@intel.com>
Reported-by: Gregor Münch <gr.muench@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-06-17 11:37:51 +01:00
Topi Pohjolainen
6967285981 i965/gen4: Set depth offset when there is stencil attachment only
Current version fails to set depthstencil.depth_offset when there
is only stencil attachment (it does set the intra tile offsets
though). Fixes piglits:

g45,g965,ilk:   depthstencil-render-miplevels 1024 s=z24_s8
g45,ilk:        depthstencil-render-miplevels 273 s=z24_s8

CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-17 06:38:56 +03:00
Topi Pohjolainen
a8e89cd539 i965/gen6: Remove dead code in hiz surface setup
In intel_hiz_miptree_buf_create() the miptree is unconditionally
created with MIPTREE_LAYOUT_FORCE_ALL_SLICE_AT_LOD.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-17 06:38:56 +03:00
Topi Pohjolainen
0d1af164e1 intel/isl/gen6: Allow arrayed stencil
Nothing prevents arrayed stencil surfaces even though hardware
doesn't support mipmapping.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-17 06:38:56 +03:00
Brian Paul
e3f5b8ac16 svga: add new num-failed-allocations HUD query
This counter is incremented if we fail to allocate memory for
vertex/index/const buffers, textures, etc.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-06-16 17:04:08 -06:00
Brian Paul
b27281c110 gallium/hud: support GALLIUM_HUD_DUMP_DIR feature on Windows
Use a dummy implementation of the access() function.  Use \ path separator.
Add a few comments.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-06-16 17:04:02 -06:00
Brian Paul
d6cb912d65 svga: add a few minor comments
Trivial.
2017-06-16 17:03:01 -06:00
Brian Paul
15f4c3ada4 mesa: whitespace fixes in enable.c
Remove trailing whitespace, replace tabs w/ spaces, etc.  Trivial.
2017-06-16 17:03:01 -06:00
Rafael Antognolli
c2b5a26dc2 i965: Convert SF_STATE to genxml.
This patch finishes the work done by Ken of converting SF_STATE to genxml, and
merges it with gen6+ code for emitting that state.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-16 15:01:16 -07:00
Rafael Antognolli
3a767f8b06 genxml: The viewport state offset is actually an address.
This fixes code generation on gen45.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-16 15:01:16 -07:00
Rafael Antognolli
ad109c16c2 genxml: Rename fields to match gen6+.
"Anti-aliasing Enable" to "Anti-Aliasing Enable".

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-16 15:01:16 -07:00
Rafael Antognolli
1b42cd52a2 genxml: Rename SF_STATE field to match gen6+.
Rename "Use Point Width State" to "Point Width Source". It accepts the same
values and has the same meaning as gen6+, so let's keep them with the same name
to simplify the code.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-16 15:01:16 -07:00
Rafael Antognolli
bd40c71132 i965: aa_line_distance_mode should be before the padding.
It seems that it was never set correctly.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-16 15:01:16 -07:00
Tim Rowley
a6237e4b7f swr/rast: Fix read-back of viewport array index
Binner/clipper read viewport array index from the vertex header as needed.
Move viewport state to BACKEND_STATE.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-16 16:20:16 -05:00
Tim Rowley
9b448da60f swr/rast: Refactor includes to limit simdintrin.h usage
Reduces the files rebuilt after modifying simdintrin.h from
84 to 64.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-16 16:20:16 -05:00
Tim Rowley
08a466aec0 swr/rast: Fix read-back of render target array index
The last FE stage can emit render target array index. Currently we only
check to see if GS is emitting it. Moved the state to BACKEND_STATE and
plumbed the driver to set it.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-16 16:20:16 -05:00
Tim Rowley
17cdd1e796 swr/rast: Adjust cast for gcc warning
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-16 16:20:16 -05:00
Tim Rowley
bea00a7b6e swr/rast: Don't transition hottile resolved->dirty during store tiles
Fixes crash when dumping render targets and RT surface has been deleted.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-16 16:20:16 -05:00
Tim Rowley
5c08bfbd17 swr/rast: gen_llvm_types.py support for SIMD256/SIMD512
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-16 16:20:16 -05:00
Tim Rowley
21baadfe58 swr/rast: Properly size GS stage scratch space
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-16 16:20:16 -05:00
Tim Rowley
3695c8ec1e swr/rast: Fix early z / query interaction
For certain cases, we perform early z for optimization. The GL_SAMPLES_PASSED
query was providing erroneous results because we were counting the number
of samples passed before the fragment shader, which did not work if the
fragment shader contained a discard.

Account properly for discard and early z, by anding the zpass mask with
the post fragment shader active mask, after the fragment shader.

Fixes the following piglit tests:
    - occlusion-query-discard
    - occlusion_query_meta_fragments

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-16 16:20:16 -05:00
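
The masking step described in 3695c8ec1e can be sketched with plain bitmasks. The function name and the one-bit-per-sample mask layout are assumptions for illustration; the rasterizer itself works on SIMD masks.

```python
def samples_passed(zpass_mask, post_fs_active_mask):
    """Count samples that passed early z AND are still active after the fragment shader."""
    return bin(zpass_mask & post_fs_active_mask).count("1")


# Four samples pass early z, but the fragment shader discards two of them.
# Counting before the shader would report 4; the corrected count is 2.
print(samples_passed(0b1111, 0b0101))
```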
Tim Rowley
b7eb86c617 swr/rast: Share vertex memory between VS input/output
Removes large simdvertex stack allocation.

Vertex shader must ensure reads happen before writes.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-16 16:20:16 -05:00
Tim Rowley
7f3be3f0b8 swr/rast: Add support for dynamic vertex size for VS output
Add support for dynamic vertex size for the vertex shader output.

Add new state in SWR_FRONTEND_STATE to specify the size.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-16 16:20:16 -05:00
Tim Rowley
8e5d11cd7b swr/rast: SIMD16 FE - improve calcDeterminantIntVertical
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-16 16:20:16 -05:00
Tim Rowley
01eca81cd4 swr/rast: Add support to PA for variable sized vertices
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-16 16:20:16 -05:00
Tim Rowley
b10cdb217a swr/rast: Rework attribute layout
Move fixed attributes to the top and pack single component SGVs.
WIP to support dynamically allocated vertex size.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-16 16:20:16 -05:00
Tim Rowley
36ac8ba511 swr/rast: Remove explicit primitive id slot in the vertex layout
- Remove any special casing in the PS stage when primitive ID is input.
  Treat as a normal attribute that must be set up properly in the FE linkage.
- Remove primitive id from the PS_CONTEXT and TRI_FLAGS

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-16 16:20:16 -05:00
Tim Rowley
8716e0d8b4 swr/rast: Fix invalid 16-bit format traits for A1R5G5B5
Correctly handle formats of <= 16 bits where the component bits don't
add up to the pixel size.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-16 16:20:16 -05:00
Tim Rowley
a25093de71 swr/rast: Implement JIT shader caching to disk
Disabled by default; currently doesn't cache shaders (fs,gs,vs).

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-06-16 16:20:16 -05:00
Brian Paul
1c33dc77f7 gallium/docs: improve docs for SAMPLE_POS, SAMPLE_INFO, TXQS, MSAA semantics
For the SAMPLE_POS and SAMPLE_INFO opcodes, clarify resource vs. render
target queries, range of position values, swizzling, etc.  We basically
follow the DX10.1 conventions.

For the TXQS opcode and TGSI_SEMANTIC_SAMPLEID, clarify return value
and type.

For the TGSI_SEMANTIC_SAMPLEPOS system value, clarify the range of
positions returned.

v2: use 'undef' for unused vector components.  Use (0.5, 0.5, undef, undef)
for sample pos when MSAA not applicable.

v3: Add note that OPCODE_SAMPLE_INFO, OPCODE_SAMPLE_POS are not used yet
and the information is subject to change.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-06-16 14:07:31 -06:00
Brian Paul
005c978c5a svga: add some missing SVGA_STATS_* enum values, prefix strings
To fix the build when VMX86_STATS is defined.
Also, some minor whitespace changes to match upstream code.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-06-16 14:06:53 -06:00
Alex Deucher
5c603b902b radeonsi: add new polaris12 pci id
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: 17.0 17.1 <mesa-stable@lists.freedesktop.org>
2017-06-16 16:03:16 -04:00
Bruce Cherniak
80b587ba27 swr: Don't crash when encountering a VBO with stride = 0.
The swr driver uses vertex_buffer->stride to determine the number
of elements in a VBO. A recent change to the state-tracker made it
possible for VBOs to have stride = 0. This resulted in a divide-by-zero
crash in the driver. The solution is to use the pre-calculated vertex
element stream_pitch in this case.

This patch fixes the crash in a number of piglit and VTK tests introduced
by 17f776c27b.

There are several VTK tests that still crash and need proper handling of
vertex_buffer_index.  This will come in a follow-on patch.

v2: Correctly update all parameters for VBO constants (stride = 0).
    Also fixes the remaining crashes/regressions that v1 did
    not address, without touching vertex_buffer_index.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-06-16 13:45:24 -05:00
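
A minimal sketch of the fallback described in 80b587ba27, assuming the element count is derived by dividing the buffer size by a per-vertex pitch. The function name and parameters are hypothetical; only the idea of substituting stream_pitch when stride is zero comes from the commit message.

```python
def element_count(buffer_size, stride, stream_pitch):
    """Number of elements in a vertex buffer, tolerating stride == 0."""
    # A zero stride would divide by zero, so fall back to the
    # pre-calculated pitch of the vertex element stream.
    pitch = stride if stride != 0 else stream_pitch
    return buffer_size // pitch
```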
Anuj Phogat
c07271fef0 intel/isl: Add the maximum surface size limit
V2: Use 2^31 bytes (2GB) surface size limit on pre-gen9 and
    2^38 bytes for gen9+.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-06-16 09:05:05 -07:00
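
The limits from c07271fef0 can be expressed as a small check. The function names are hypothetical, and whether the real comparison is inclusive is an assumption here; only the two limits come from the commit message.

```python
def max_surface_size(gen):
    """Surface size limit in bytes: 2^31 before gen9, 2^38 from gen9 on."""
    return 1 << 38 if gen >= 9 else 1 << 31


def surface_fits(gen, total_size):
    """True if a surface of total_size bytes is within the limit (inclusive by assumption)."""
    return total_size <= max_surface_size(gen)
```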
Anuj Phogat
7022978237 intel/isl: Use uint64_t to store total surface size
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-06-16 09:05:05 -07:00
Chris Wilson
05d5caffc4 i965: Mark freshly allocated bo as idle
When created, buffers are idle, so mark them as such to avoid an early
ioctl or a mistaken assumption that the fresh buffer is busy.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-16 16:20:28 +01:00
Christian Gmeiner
82db591155 etnaviv: add rs-operations sw query
It could be useful to get the number of emitted resolve operations when
doing driver optimizations.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-06-16 15:28:12 +02:00
Lucas Stach
5065549e2a etnaviv: advertise correct max LOD bias
The maximum LOD bias supported is the same as the max texture level
supported.

Fixes piglit: ext_texture_lod_bias

Fixes: c9e8b49b ("etnaviv: gallium driver for Vivante GPUs")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-06-16 15:26:23 +02:00
Lucas Stach
8644b59b5d etnaviv: mask correct channel for RB swapped rendertargets
Now that we support RB swapped targets by using a shader variant, we
must derive the color mask from both the blend state and the bound
framebuffer.

Fixes piglit: fbo-colormask-formats

Fixes: 7f62ffb68a ("etnaviv: add support for rb swap")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-06-16 15:26:23 +02:00
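
The idea in 8644b59b5d, deriving the color mask from both the blend state and the framebuffer, can be sketched as below. The bit assignments and the function name are assumptions for illustration and do not reflect the actual etnaviv register layout.

```python
# Hypothetical one-bit-per-channel write mask.
R, G, B, A = 0x1, 0x2, 0x4, 0x8


def effective_colormask(blend_mask, rb_swap):
    """Color mask to program, given the blend state mask and whether the target is RB swapped."""
    if not rb_swap:
        return blend_mask
    swapped = blend_mask & (G | A)     # green and alpha are unaffected
    if blend_mask & R:
        swapped |= B                   # red data lives in the blue channel
    if blend_mask & B:
        swapped |= R                   # blue data lives in the red channel
    return swapped
```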
Lucas Stach
d6aa2ba2b2 etnaviv: replace translate_clear_color with util_pack_color
This replaces the open coded etnaviv version of the color pack with the
common util_pack_color.

Fixes piglits:
arb_color_buffer_float-clear
fcc-front-buffer-distraction
fbo-clearmipmap

Fixes: c9e8b49b ("etnaviv: gallium driver for Vivante GPUs")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-06-16 15:26:23 +02:00
Lucas Stach
6633880e7e etnaviv: remove bogus assert
etna_resource_copy_region handles resources with multiple samples
by falling back to the software path. There is no need to kill the
application there.

Fixes: c9e8b49b ("etnaviv: gallium driver for Vivante GPUs")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-06-16 15:26:23 +02:00
Lucas Stach
ff490eb8fd etnaviv: use padded width/height for resource copies
When copying a resource fully we can just blit the whole level. This allows
using the RS even for level sizes not aligned to the RS min alignment. This
is especially useful, as etna_copy_resource is part of the software fallback
paths (used in etna_transfer), that are used for doing unaligned copies.

Fixes: c9e8b49b ("etnaviv: gallium driver for Vivante GPUs")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-06-16 15:26:23 +02:00
Lucas Stach
2a6183d416 etnaviv: don't try RS blit if blit region is unaligned
If the blit region is not aligned to the RS min alignment don't try
to execute the blit, but fall back to the software path.

Fixes: c9e8b49b ("etnaviv: gallium driver for Vivante GPUs")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-06-16 15:26:23 +02:00
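
The check introduced by 2a6183d416 amounts to testing the blit region against the RS minimum alignment before choosing a path. In this sketch the function names and the 16x4 alignment values are hypothetical placeholders, not the real hardware constants.

```python
def can_use_rs_blit(x, y, width, height, align_w, align_h):
    """True if the blit region starts and ends on RS alignment boundaries."""
    return (x % align_w == 0 and y % align_h == 0 and
            width % align_w == 0 and height % align_h == 0)


def choose_blit_path(x, y, width, height, align_w=16, align_h=4):
    """Pick the RS engine for aligned regions, otherwise the software fallback."""
    if can_use_rs_blit(x, y, width, height, align_w, align_h):
        return "rs"
    return "software"
```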
Emil Velikov
d5199cdd7a Revert "amd/common: add missing libdrm include path"
This reverts commit 44b29dd7b6.

Should no longer be required as of last patch.

Cc: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-16 12:41:44 +01:00
Emil Velikov
81945ded0d ac: remove amdgpu.h dependency
Add a couple of forward declarations and drop the amdgpu.h requirement.

With this we can build the r300 and r600 drivers without the need for
amdgpu.

v2:
 - Add amdgpu.h include in the C file (Marek)
 - Add a comment about pre C11 typedef redeclaration warning (Eric)

Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101189
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-16 12:41:44 +01:00
Jan Vesely
d96a210842 r600g,compute: provide local copy of functions from ac_binary.c
This is a verbatim copy of the code. The functions can be cleaned up since
r600 does not use all the stuff that gcn does.
The symbol names have been changed since we still use ac_binary.h header
(for struct definition)

v2: Add ifdef guard around r600_binary_clean call (Aaron)
    Remove stray comment

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Tested-By: Aaron Watry <awatry@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-16 12:41:44 +01:00
Jan Vesely
d41b7b0104 r600: android: amdgpu_common is only required when building OpenCL
v2: split off Android changes

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-16 12:41:44 +01:00
Eric Engestrom
311c091658 egl/display: make platform detection thread-safe
Imagine there are 2 threads that both call _eglGetNativePlatform()
simultaneously:
- thread 1 completes the first "if (native_platform ==
  _EGL_INVALID_PLATFORM)" check and is preempted to do something else
- thread 2 executes the whole function, does "native_platform =
  _EGL_NATIVE_PLATFORM" and just before returning it's preempted
- thread 1 wakes up and calls _eglGetNativePlatformFromEnv() which
  returns _EGL_INVALID_PLATFORM because no env vars are set, updates
  native_platform and then gets preempted again
- thread 2 wakes up and returns the wrong value, _EGL_INVALID_PLATFORM

Solve this by doing the detection in a local var and only overwriting
the global one at the end, if no other thread has updated it since.

This means the platform detected in the thread might not be the platform
returned by the function, but this is a different issue that will need
to be discussed when this becomes possible.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101252
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-06-16 11:02:06 +01:00
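
The approach described in 311c091658 (detect into a local variable, then publish only if nobody else has) can be sketched as follows. This is an illustration under stated assumptions: all names are hypothetical, `_detect_platform()` stands in for the environment and build-time detection, and a lock-guarded compare-and-set is used here in place of whatever atomic primitive the C code uses.

```python
import threading

INVALID = None                    # stand-in for _EGL_INVALID_PLATFORM
_native_platform = INVALID        # shared, detected at most once
_publish_lock = threading.Lock()


def _detect_platform():
    """Hypothetical stand-in for the real detection logic."""
    return "x11"


def get_native_platform():
    global _native_platform
    detected = _native_platform            # snapshot of the shared value
    if detected is INVALID:
        detected = _detect_platform()      # work only on the local variable
        with _publish_lock:
            # Publish only if no other thread has done so in the meantime,
            # so the shared value never holds an intermediate result.
            if _native_platform is INVALID:
                _native_platform = detected
    # Return the shared value, which may differ from what this thread detected.
    return _native_platform
```

Because the shared variable is written only once, and only with a final result, a concurrent caller can no longer observe the intermediate invalid value described in the commit message.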
Eric Engestrom
4ca9ae587c egl/display: only detect the platform once
My refactor missed the fact that `native_platform` is static.
Add the proper guard around the detection code, as it might not be
necessary, and only print the debug message when a detection was
actually performed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101252
Fixes: 7adb9b0948 ("egl/display: remove unnecessary code and
                              make it easier to read")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-06-16 11:02:05 +01:00
Thomas Hellstrom
9d81ab7376 svga: Relax the format checks for copy_region_vgpu10 somewhat
The new generic checks were actually more restrictive than the previous svga-
specific tests and not vice versa. So bypass the common format checks for
copy_region_vgpu10.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-06-16 08:40:26 +02:00
Thomas Hellstrom
a37eede540 svga: Fix incorrect format conversion blit destination
The blit.dst.resource member that was used as destination was
modified earlier in the function, effectively making us try to blit
the content onto itself. Fix this and also add a debug printout when the
format conversion blits fail.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-06-16 08:40:26 +02:00
Thomas Hellstrom
5732ac3ecc svga: Fix srgb copy_region regression
This fixes a tf2 srgb copy_region regression from
"svga: Rework the blit and resource_copy_region functionality v3"

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-06-16 08:40:26 +02:00
Thomas Hellstrom
14f888a2ba svga: Prefer accelerated blits over cpu copy region
This reduces the number of cpu copy_region fallbacks on an Nvidia system
running the piglit command

./publish/bin/piglit run  -1 -t copy -t blit tests/quick

from 64789 to 780

Previously this has caused a regression in piglit test
spec@!opengl 1.0@gl-1.0-scissor-copypixels, but I'm currently not able to
reproduce that regression.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-06-16 08:40:26 +02:00
Thomas Hellstrom
4c3e8f141b svga: Support accelerated conditional blitting
The blitter has functions to save and restore the conditional rendering state,
but we currently don't save the needed info.

Since the copy_region_vgpu10 path also supports conditional blitting,
we instead use the same function as the clearing routines and move
that function to svga_pipe_query.c

Note that we still haven't implemented conditional blitting with
the software fallbacks.

Fixes piglit nv_conditional_render::copyteximage

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-06-16 08:40:26 +02:00
Thomas Hellstrom
71f857d6ab svga: Use utility functions to help determine whether we can use copy_region
It seems like the SVGA tests are in general more stringent than the utility
tests, but they also miss some blitter features like filters and window
rectangles, and if new blitter features are added in the future, it might
be possible that we forget adding tests for those.

So in addition to the SVGA tests, use the utility tests to restrict the
situations where we can use copy_region.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-06-16 08:40:26 +02:00
Thomas Hellstrom
f4c2d4bd4a svga: Rework the blit and resource_copy_region functionality v3
This work was initially triggered by the fact that imported surfaces may
be backed by other SVGA3D formats than the default. Therefore some fixes were
needed to avoid using the copy_region_vgpu10() functionality for incompatible
SVGA3D formats where the pipe formats were OK. This situation happens when
using dri3.

Also in some situations, for example where a R8G8_UNORM surface is backed by
an SVGA3D_NV12 format, we can't use the copy_region functionality at all and
thus need to fall back to the quad blitter also for the resource_copy_region
function. This situation doesn't happen currently, but will if we start using
video textures.

The patch makes the blit- and copy_region paths similar and the decision whether
to use a certain gpu command should now be easy to locate. Probably the
resource_copy_region path will suffer from a minor additional cpu overhead,
but on the other hand there are more cases now that we accelerate, since
we try harder before falling back to cpu copies / blits.

v2: Addressed review comments and fixed up piglit failures by sometimes
preferring cpu_copy_region() over blit().
v3: Removed a stray test statement. Updated commit message.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-06-16 08:40:26 +02:00
Kenneth Graunke
ad412d6319 i965: Improve conditional rendering in fallback paths.
We need to fall back in a couple of cases:
- Sandybridge (it just doesn't do this in hardware)
- Occlusion queries on Gen7-7.5 with command parser version < 2
- Transform feedback overflow queries on Gen7, or on Gen7.5 with
  command parser version < 7

In these cases, we printed a perf_debug message and fell back to
_mesa_check_conditional_render(), which stalls until the full
query result is available.  Additionally, the code to handle this
was a bit of a mess.

We can do better by using our normal conditional rendering code,
and setting a new state, BRW_PREDICATE_STATE_STALL_FOR_QUERY, when
we would have set BRW_PREDICATE_STATE_USE_BIT.  Only if that state
is set do we perf_debug and potentially stall.  This means we avoid
stalls when we have a partial query result (i.e. we know it's > 0,
but don't have the full value).  The perf_debug should trigger less
often as well.

Still, this is primarily intended as a cleanup.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-15 22:42:50 -07:00
Emil Velikov
1b03323e17 configure.ac: remove manual AC_SUBST for pthread-stubs
Unneeded, since the PKG_CHECK_MODULES macro already does the
substitution of the package Cflags/Libs.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-15 23:24:26 +01:00
Emil Velikov
e5aa806e5f configure.ac: add -pthread to PTHREAD_LIBS
As described inline - follow what's written in the manual and what works
for all platforms that Mesa supports.

We want to untangle things leaving only -pthread, yet that has a
potential of causing regressions. Thus we'll do it as a follow-up patch.

As a nice side effect, this resolves issues where the system lacks
libpthread.so, yet the linker does not warn about it and we end up with
unresolved symbols.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101071
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-15 23:24:26 +01:00
Timothy Arceri
fcbb93e860 mesa: stop assigning unused storage for non-bindless opaque types
The storage was once used by get_sampler_uniform_value() but that
was fixed long ago to use the uniform storage assigned by the
linker.

By not assigning storage for images/samplers the constant buffer
for gallium drivers will be reduced which could result in small
perf improvements.

V2: rebase on ARB_bindless_texture

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-16 08:09:03 +10:00
Robert Foss
a4d3371176 egl/android: Fix typ-o
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-06-15 22:35:10 +01:00
Brian Paul
c8f344ed2d draw: check for line_width != 1.0f in validate_pipeline()
We shouldn't use the wide line stage if the line width is 1.
This check isn't strictly needed because all drivers are (now)
specifying a line wide threshold of at least 1.0 pixels, but
let's play it safe.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-06-15 13:53:00 -06:00
Brian Paul
c2b92dada0 svga: clamp device line width to at least 1 to fix HWv8 line stippling
The line stipple fallback code for virtual HW version 8 didn't work.

With HW version 8, we were getting zero when querying the max line
widths (AA and non-AA).  This means we were setting the draw module's
wide line threshold to zero.  This caused the wide line stage to always
get enabled.  That caused the line stipple module to fail because the
wide line stage was clobbering the rasterization state with a state
object setting the line stipple pattern to 0xffff.

Now the wide_lines variable in draw's validate_pipeline() will not
be incorrectly set.
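
A simplified sketch of the interaction described above, together with
the line_width != 1.0f check from the related draw commit. The function
names are hypothetical and the real validate_pipeline() logic considers
more state than this:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: never report a device line width below 1 pixel. */
static float clamp_device_line_width(float queried)
{
   return queried < 1.0f ? 1.0f : queried;
}

/* Simplified old decision: only the threshold is consulted. */
static bool wide_lines_old(float line_width, float threshold)
{
   return line_width > threshold;
}

/* Simplified new decision: a width of exactly 1 never needs the stage. */
static bool wide_lines_new(float line_width, float threshold)
{
   return line_width != 1.0f && line_width > threshold;
}

static void demo_wide_lines(void)
{
   /* A queried maximum of 0 used to enable the stage for 1-pixel lines. */
   assert(wide_lines_old(1.0f, 0.0f));
   /* Clamping the threshold to 1 avoids that. */
   assert(!wide_lines_old(1.0f, clamp_device_line_width(0.0f)));
   /* The extra check is safe even with a zero threshold. */
   assert(!wide_lines_new(1.0f, 0.0f));
}
```

Either change alone keeps 1-pixel lines out of the wide line stage in
this model; having both is the "play it safe" part.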

Also improve debug output.

BTW, this also fixes several other piglit tests: polygon-mode,
primitive-restart-draw-mode, and line-flat-clip-color, since they
all use the draw module fallback.

See VMware bug 1895811.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-06-15 13:53:00 -06:00
Brian Paul
c9f4e069ba draw: whitespace and formatting fixes
Trivial.
2017-06-15 13:53:00 -06:00
Brian Paul
c2e00c29b7 automake: increase the MESA_GIT_SHA1 hash id length from 7 to 10 digits
The SCons build has been using 10 digits of the git hash id for the
MESA_GIT_SHA1 string in git_sha1.h for about a year now.  I bumped it
up after running into a case where a 7-digit hash ID was ambiguous.

This patch makes the same change for the autotools build.

The command "git log | grep "^commit" | cut -b 8-14 | sort | uniq -d"
shows there are currently 17 cases where 7 digits of hash id are
ambiguous on master (probably quite a few more if we'd consider other
branches).
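
For illustration, the same ambiguity check can be sketched in C. The
hash strings below are made up, and count_ambiguous_prefixes() is a
hypothetical helper, not something in the tree:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Made-up, lexicographically sorted hash ids (not real commits). */
static const char *const sample_ids[] = {
   "1a2b3c4d5e",
   "1a2b3c4f00",
   "1a2b3c4f11",
   "9f8e7d6c5b",
};

/* Count distinct prefixes of length n shared by two or more entries.
 * The list must be sorted so that equal prefixes are adjacent. */
static size_t count_ambiguous_prefixes(const char *const *ids, size_t count,
                                       size_t n)
{
   size_t ambiguous = 0;

   for (size_t i = 1; i < count; i++) {
      if (strncmp(ids[i - 1], ids[i], n) != 0)
         continue;
      /* Count each duplicated prefix once, like "uniq -d" does. */
      if (i < 2 || strncmp(ids[i - 2], ids[i], n) != 0)
         ambiguous++;
   }
   return ambiguous;
}

static void demo_prefixes(void)
{
   /* Three ids share a 7-digit prefix; none share all 10 digits. */
   assert(count_ambiguous_prefixes(sample_ids, 4, 7) == 1);
   assert(count_ambiguous_prefixes(sample_ids, 4, 10) == 0);
}
```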

Instead of using "git log -n 1 --oneline" use
"git rev-parse --short=10 HEAD" to get the HEAD hash id.

v2: use printf instead of sed, per Eric's suggestion.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-15 13:53:00 -06:00
Eric Anholt
7029ec05e2 gallium: Add renderonly-based support for pl111+vc4.
This follows the model of imx (display) and etnaviv (render): pl111 is a
display-only device, so when asked to do GL for it, we see if we have a
vc4 renderer, make the vc4 screen, and have vc4 call back to pl111 to do
scanout allocations.

The difference from etnaviv is that we share the same BO between vc4 and
pl111, rather than having a vc4 BO and a pl111 BO and copies between the
two.  The only mismatch between their requirements is that vc4 requires
4-pixel (at 32bpp) stride alignment, while pl111 requires that stride
match width.  The kernel will reject any modesets to an incorrect stride,
so the 3D driver doesn't need to worry about that.
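
A rough sketch of when the two stride requirements agree, assuming
32bpp and using hypothetical helper names (the real checks live in the
drivers and the kernel):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BYTES_PER_PIXEL 4u   /* 32bpp, as in the text above */

/* Render side: stride is aligned up to a multiple of 4 pixels. */
static uint32_t render_stride_bytes(uint32_t width_px)
{
   uint32_t aligned_px = (width_px + 3u) & ~3u;
   return aligned_px * BYTES_PER_PIXEL;
}

/* Display side: stride must match the width exactly. */
static uint32_t scanout_stride_bytes(uint32_t width_px)
{
   return width_px * BYTES_PER_PIXEL;
}

/* The shared BO is only scanout-compatible when both agree. */
static bool strides_agree(uint32_t width_px)
{
   return render_stride_bytes(width_px) == scanout_stride_bytes(width_px);
}

static void demo_strides(void)
{
   assert(strides_agree(640));     /* 640 is already a multiple of 4 */
   assert(!strides_agree(1366));   /* aligned up to 1368 pixels */
}
```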

v2: Rebase on Android rework, drop unused include.
v3: Fix another Android bug, from Rob Herring's build-testing.

Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-06-15 11:41:22 -07:00
Eric Anholt
7a17191305 etnaviv: Only use renderonly_get_handle for GEM handles.
Note that for requests for Prime FDs or flink names, we return handles to
the etnaviv BO, not the scanout BO.  This is at least better than the previous
behavior of returning GEM handles for a request for an FD or flink name.

And add an assert that renderonly_get_handle is only used for getting the
GEM handle.

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-06-15 11:41:22 -07:00
Mauro Rossi
d5a9608076 android: r600/eg: add support for tracing IBs after a hang.
The rules to generate egd_tables.h are added in Android makefile

Fixes: f42fb00 "r600/eg: add support for tracing IBs after a hang."
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-15 15:46:44 +01:00
Mauro Rossi
d5523d912c svga: fix git_sha1.h include path in Android.mk (v3)
Adds libmesa_git_sha1 static (dummy) library to generate git_sha1.h
with some polishing to header dependency on .git/HEAD and scripted rules.

The now redundant generation rules are removed from Android.gen.mk.
The libmesa_git_sha1 whole static dependency is added to the libmesa_pipe_svga,
libmesa_dricore and libmesa_st_mesa modules.

Fixes the following build error:

external/mesa/src/gallium/drivers/svga/svga_screen.c:26:10:
fatal error: 'git_sha1.h' file not found
         ^
1 error generated.

Fixes: 1ce3a27 ("svga: Add the ability to log messages to
vmware.log on the host.")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-15 15:46:44 +01:00
Andres Gomez
5d87667fed bin/get-fixes-pick-list.sh: better identify multiple "fixes:" tags
We were not treating lines like the following as multiple fixes:
Fixes: $sha_1, Fixes: $sha_2

Now, we split the lines so we will consider them individually, as in:
Fixes: $sha_1,
Fixes: $sha_2

Additionally, we try to get the SHA from split lines so:
Fixes:
$sha_1

Will be considered as:
Fixes: $sha_1

v2:
 - Treat empty spaces earlier in fix lines (Emil)
 - Fold 2 lines into one to gather fix commit ids (Emil)

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-15 15:53:21 +03:00
Andres Gomez
f1590363c9 bin/get-fixes-pick-list.sh: parse just the commit message
We were parsing the whole diff, although the candidates were
identified only by the commit message.

Now, we only use the commit message for parsing.

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emli.velikov@collabora.com>
2017-06-15 15:53:21 +03:00
Samuel Pitoiset
e8df89d2c5 gallium/radeon: fix initialization of new resource bindless fields
r600_resource objects are not calloc'd.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-15 11:56:21 +02:00
Lucas Stach
4026744fcb gbm: implement FD import with modifier
This implements a way to import FDs with modifiers on plain GBM devices,
without the need to go through EGL. This is mostly to the benefit of
gbm_gralloc, which can keep its dependencies low.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-06-15 10:43:36 +01:00
Lucas Stach
71b78b6b0c gbm: add API to import FD with modifier
This allows importing an FD with an explicit modifier passed through
userspace protocols.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-06-15 10:43:23 +01:00
Emil Velikov
18d4a6f964 i965: add gen4_blorp_exec.h to the sources list
We tend to use the sources, as opposed to EXTRA_DIST to include the
headers.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-06-15 10:29:47 +01:00
Michel Dänzer
176e761513 gallium/util: Break recursion in pipe_resource_reference
Because the function called itself recursively, it could not be inlined, and
a copy was generated in every compilation unit referencing it. This
bloated the text segment of the Gallium mega-driver *_dri.so by ~4%,
and might also have impacted performance.

Fixes: ecd6fce261 ("mesa/st: support lowering multi-planar YUV")
v2:
* Add comment above pipe_resource_next_reference [Samuel Pitoiset]
v3:
* Use loop to unreference the full chain of resources referenced via
  the next members [Timothy Arceri]
v4:
* Stop chasing ->next chain at the first sub-resource which isn't
  destroyed [Nicolai Hähnle]
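
The loop described in v3/v4 can be sketched roughly as follows. The
struct and helpers are simplified, hypothetical stand-ins rather than
the real pipe_resource code:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted resource with a ->next chain. */
struct res {
   int refcount;
   struct res *next;
};

static int destroyed_count;

static struct res *res_new(struct res *next)
{
   struct res *r = malloc(sizeof(*r));
   if (!r)
      abort();
   r->refcount = 1;   /* one reference for the creator or parent link */
   r->next = next;
   return r;
}

/* Drop one reference; returns nonzero if the count reached zero. */
static int res_unref(struct res *r)
{
   return --r->refcount == 0;
}

/* Release r and walk its ->next chain iteratively instead of
 * recursing. The walk stops at the first sub-resource that is still
 * referenced elsewhere and therefore is not destroyed. */
static void res_release_chain(struct res *r)
{
   while (r && res_unref(r)) {
      struct res *next = r->next;
      free(r);
      destroyed_count++;
      r = next;
   }
}
```

Because nothing here calls itself, a compiler is free to inline the
release path into its callers.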

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-15 11:24:59 +09:00
Samuel Pitoiset
1c00af4264 mesa: fix 'make check' by moving bindless functions to the right place
Fixes: 5f249b9f05 ("mapi: add GL_ARB_bindless_texture entry points")
Reported-by: Mark Janes <mark.a.janes@intel.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2017-06-15 10:44:38 +09:00
Jason Ekstrand
1d132712fe i965/miptree: Use the new simple alloc_tiled for CCS buffers
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-14 18:15:05 -07:00
Jason Ekstrand
21d83f54b3 i965/bufmgr: Add a new, simpler, bo_alloc_tiled
ISL already has all of the complexity required to figure out the correct
surface pitch and size taking tile alignment into account.  When we get
a surface out of ISL, the pitch and size are already correct and using
brw_bo_alloc_tiled_2d doesn't actually gain us anything other than extra
asserts we have to do in order to ensure that the bufmgr code and ISL
agree.  This new helper doesn't try to be smart but just allocates the
BO you ask for and sets up the tiling.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-14 18:15:05 -07:00
Jason Ekstrand
6ee0530c35 i965/bufmgr: Rename bo_alloc_tiled to bo_alloc_tiled_2d
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-14 18:15:05 -07:00
Jason Ekstrand
862493f7cb i965: Use blorp for depth/stencil clears on gen6+
Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-14 18:15:05 -07:00
Jason Ekstrand
f762962f7f i965: Set step_rate = 0 for interleaved vertex buffers
Before, we weren't setting step rate so we got whatever old value
happened to be lying around.  This can lead to some interesting
rendering errors.  In particular, if you run the OpenGL ES CTS with
dEQP-GLES3.functional.instanced.types.mat2x4 immediately followed by one
of the dEQP-GLES3.functional.transform_feedback.* tests, the transform
feedback test gets stale instancing data from the other test and fails.
The only thing that is causing this to not be a problem today is that we
use meta for clears and meta is setting up vertex buffers via the VBO or
non-interleaved path and setting step_rate to 0 for us.  When blorp
depth/stencil clears are enabled, meta is no longer sitting between the
two tests and the stale data starts causing noticeable problems.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-06-14 18:15:05 -07:00
Jason Ekstrand
b3569e7445 i965: Disable the interleaved vertex optimization when instancing
Instance divisor is a property of the vertex buffer and not the vertex
element so if we ever see anything other than 0, bail.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-06-14 18:15:05 -07:00
Jason Ekstrand
7175561598 intel/blorp: Work around Sandy Bridge occlusion query issue
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-06-14 18:15:05 -07:00
Jason Ekstrand
39a13c08dc i965/blorp: Set no_depth_or_stencil correctly
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-06-14 18:15:05 -07:00
Jason Ekstrand
b14852997a i965: Remove some unneeded fields from brw_context
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-06-14 18:15:05 -07:00
Jason Ekstrand
ea225d4da4 i965: Remove some of the remnants of meta
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-06-14 18:15:05 -07:00
Jason Ekstrand
96f9d4de7d intel/isl: Properly set SeparateStencilBufferEnable on gen5-6
On gen5-6, SeparateStencilBufferEnable and HierarchicalDepthBufferEnable
come hand in hand and we have to set either both or neither.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-06-14 18:15:05 -07:00
Jason Ekstrand
ee0e29dd02 i965/miptree: Choose the stencil layout in miptree_create_layout
This ensures that we get the correct layout for all stencil buffers, not
just those which are created as separate stencil for a depth buffer.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-06-14 18:15:05 -07:00
Jason Ekstrand
6f6aa0f462 mesa: Add a BUFFER_BITS mask for depth+stencil
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-06-14 18:15:05 -07:00
Jason Ekstrand
83ab6327c1 i965/blorp: Set aux_usage to NONE for miplevels without HiZ
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-06-14 18:15:05 -07:00
Aaron Watry
e4d06e4c53 radeon/winsys: Limit max allocation size to 70% of VRAM
The CL CTS queries the max allocation size, and then attempts to
allocate buffers of that size. If not enough contiguous RAM/VRAM is
available, this causes errors in the radeon kernel module due to
inability to allocate the required memory.

It's a bit of a hack, but experimentally on my system, I can use ~3/4
of the card's VRAM for a single global/constant buffer allocation given
current GUI/compositor use.

For a 1GB Pitcairn (HD7850) this gets me from the reported clinfo values of:
Global memory size                              2143076352 (1.996GiB)
Max memory allocation                           1500153446 (1.397GiB)
Max constant buffer size                        1500153446 (1.397GiB)

To:
Global memory size                              2143076352 (1.996GiB)
Max memory allocation                           751619276 (716MiB)
Max constant buffer size                        751619276 (716MiB)
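
As a sanity check on the numbers above: the old limit is consistent
with 70% of the reported global memory size, and the new one with 70%
of 1GiB of VRAM. The helper below is only an illustration using integer
math, not the winsys code:

```c
#include <assert.h>
#include <stdint.h>

/* 70% of a byte count, truncating any fractional byte. */
static uint64_t seventy_percent(uint64_t bytes)
{
   return bytes * 7 / 10;
}

static void demo_alloc_limit(void)
{
   /* Old: 70% of the 2143076352-byte global memory size. */
   assert(seventy_percent(2143076352ull) == 1500153446ull);
   /* New: 70% of 1GiB (1073741824 bytes) of VRAM. */
   assert(seventy_percent(1073741824ull) == 751619276ull);
}
```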

Fixes: OpenCL CTS test/conformance/api/min_max_mem_alloc_size,
       OpenCL CTS test/conformance/api/min_max_constant_buffer_size

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 19:38:55 -05:00
Kenneth Graunke
b6d56c747c i965: Use a line end cap width of 0.5 unless smooth lines enabled.
This updates the Gen4-5 code to use a line end cap width of 0.5
for non-smooth lines, and 1.0 for smooth lines - which is what we
do on Gen6+.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-14 15:56:21 -07:00
Kenneth Graunke
6563d5287b i965: Use brw_get_line_width() in Gen4-5 SF_STATE code.
This unifies the Gen4-5 and Gen6+ line width calculations.

I believe it also fixes a bug - we weren't rounding the line width
to the nearest integer.  The GL 4.5 (and GL 2.1) specs "Wide Lines"
section says:

"The actual width of non-antialiased lines is determined by rounding
 the supplied width to the nearest integer, then clamping it to the
 implementation-dependent maximum non-antialiased line width."

We don't need to care about _NEW_MULTISAMPLE here because multisampling
doesn't exist on Gen4-5, so the state shouldn't change.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-14 15:56:21 -07:00
Kenneth Graunke
af373ea4a2 genxml: Fix Gen4-5 SF_STATE "Line Width" fixed point type.
It's a U3.1.  It became a U3.7 on Sandybridge.
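
To illustrate the difference, assuming Ua.b means an unsigned value with
a integer bits and b fractional bits, a hypothetical encoder (not the
genxml packing code) could look like this:

```c
#include <assert.h>
#include <stdint.h>

/* Encode a non-negative value as unsigned fixed point, rounding to the
 * nearest step and clamping to the largest representable value. */
static uint32_t encode_ufixed(float value, unsigned int_bits,
                              unsigned frac_bits)
{
   const uint32_t max = (1u << (int_bits + frac_bits)) - 1;
   float scaled;

   if (value <= 0.0f)
      return 0;

   scaled = value * (float)(1u << frac_bits) + 0.5f;
   if (scaled >= (float)max)
      return max;
   return (uint32_t)scaled;
}

static void demo_line_width_encoding(void)
{
   /* A width of 2.5 is 5 half-pixel steps in U3.1 ... */
   assert(encode_ufixed(2.5f, 3, 1) == 5);
   /* ... but 320 steps of 1/128 pixel in U3.7. */
   assert(encode_ufixed(2.5f, 3, 7) == 320);
}
```

Packing a U3.7 value into a U3.1 field (or the reverse) would give a
width that is off by a factor of 64, which is why the type matters.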

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-14 15:56:21 -07:00
Kenneth Graunke
3793410369 i965: Stop using BRW_RASTRULE_LOWER_RIGHT on Gen4-5.
This effectively reverts Robert Ellison's 2009 commit
cc8afbd386.

I'm not seeing any GL spec text indicating that UPPER won't work.
On Gen6+, this bit moved to 3DSTATE_WM as a single bit, controlling
UPPER_LEFT vs. UPPER_RIGHT.  There is no way to request LOWER_RIGHT,
so UPPER_RIGHT is the best you can do.

In the G45 docs, it's marked as "Reserved" as well, but we just
decided to use it anyway.

This patch unifies the behavior between Gen4-5 and Gen6+.

Note that this is separate from point sprite texcoord behavior.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-14 15:56:21 -07:00
Kenneth Graunke
6d4e031d9a i965: When gl_PointSize is unwritten, default to 1.0 on Gen4-5.
Modern GL specifications say that the point size should be 1.0 when
gl_PointSize is unwritten and the last enabled stage is a geometry
or tessellation shader.  If it's a vertex shader, though, both the
GL specs and ES 3.0 spec say that it's undefined - so since Gen4-5
only support vertex shaders, there's no actual requirement to do this.

Since there is a cost associated (an extra dirty bit, which may cause
SF_STATE to be emitted more often), it may not be a good idea.

The real benefit is that it makes all generations behave identically.
And that seems somewhat nice...

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-14 15:56:21 -07:00
Kenneth Graunke
3d34e27522 i965: Make Gen4-5 SF_STATE use the point size calculations from Gen6+.
Apparently, Nanhai made the Gen4-5 point size calculations round to the
nearest integer in commit 8d5231a358,
"according to spec".  When Eric first ported the driver to Sandybridge,
he did not implement this rounding.

In the GL 2.1 and 3.0 specs "Basic Point Rasterization" section, it does
say "If antialiasing and point sprites are disabled, the actual width is
determined by rounding the supplied width to the nearest integer, then
clamping it to the implementation-dependent maximum non-antialiased point
width."

In contrast, GL 3.1 and later do not appear to contain this rounding.

It might be reasonable to round, given that we only implement GL 2.1.
Of course, if we were to do that, we should actually implement the AA
vs. non-AA distinction.  Brian added an XXX comment reminding us to fix
this 10 years ago, but it never happened.

I think a better plan is to follow the newer, unrounded behavior.  This
is what we do on Gen6+ and it passes all the relevant conformance tests.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-14 15:56:21 -07:00
Jason Ekstrand
d9261275cc i965: Do an end-of-pipe sync after flushes
According to the docs, a simple CS stall is insufficient to ensure that
the memory from the flush is visible and an end-of-pipe sync is needed.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-14 15:11:42 -07:00
Jason Ekstrand
314ec7b46f i965/blorp: Do an end-of-pipe sync around CCS ops
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-14 15:11:40 -07:00
Jason Ekstrand
96e7b7ac54 i965: Do an end-of-pipe sync prior to STATE_BASE_ADDRESS
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-14 15:11:39 -07:00
Topi Pohjolainen
7b607aae3f i965: Add an end-of-pipe sync helper
v2 (Jason Ekstrand):
 - Take a flags parameter to control the flushes
 - Refactoring

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-14 15:11:22 -07:00
Jason Ekstrand
b771d9a136 i965: Unify the two emit_pipe_control functions
These two functions contain almost identical logic except for one SNB
workaround required for render target cache flushes.  They may as well
call into the same code so we only have to handle the work-arounds in
one place.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-14 15:11:21 -07:00
Jason Ekstrand
a8ea68bc93 i965: Take a uint64_t immediate in emit_pipe_control_write
It's a 64-bit value.  Splitting it up just makes the function arguments
awkward.
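
A minimal sketch of the idea, with a hypothetical helper name: callers
pass one 64-bit value and the split into two dwords happens in a single
place.

```c
#include <assert.h>
#include <stdint.h>

/* Split a 64-bit immediate into the two 32-bit dwords of a packet. */
static void split_imm(uint64_t imm, uint32_t *lower, uint32_t *upper)
{
   *lower = (uint32_t)(imm & 0xffffffffu);
   *upper = (uint32_t)(imm >> 32);
}

static void demo_split_imm(void)
{
   uint32_t lo, hi;

   split_imm(0x1122334455667788ull, &lo, &hi);
   assert(lo == 0x55667788u);
   assert(hi == 0x11223344u);
}
```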

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-14 15:11:19 -07:00
Jason Ekstrand
86da08367b i965: Flush around state base address
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-14 15:11:06 -07:00
Kenneth Graunke
244c2a5d2c i965: Print "force dual color blending" in FS recompile debug output.
I forgot to add this when introducing the new key field.  It doesn't
happen often - just with the Unigine workarounds.  But we may as well
have it, so we get an accurate picture of why recompiles happen.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-06-14 14:30:11 -07:00
Eric Le Bihan
2154defcd6 Fix khrplatform.h not installed if EGL is disabled.
KHR/khrplatform.h is required by the EGL, GLES and VG headers, but is
only installed if Mesa3d is compiled with EGL support.

This patch installs this header file unconditionally.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77240
Signed-off-by: Eric Le Bihan <eric.le.bihan.dev@free.fr>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-14 16:55:13 +01:00
Ville Syrjälä
c1eedb43f3 i915: Fix wpos_tex vs. -1 comparison
wpos_tex used to be a GLuint so assigning -1 to it and
later comparing with -1 worked correctly, but commit
c349031c27 ("i915: Fix texcoord vs. varying collision in
fragment programs") changed wpos_tex to uint8_t and hence
broke the comparison. To fix this define a more explicit
invalid value for wpos_tex.

gcc warns us:
i915_fragprog.c:1255:57: warning: comparison is always true due to limited range of data type [-Wtype-limits]
    if (inputsRead & VARYING_BITS_TEX_ANY || p->wpos_tex != -1) {
                                                         ^

And clang says:
i915_fragprog.c:1255:57: warning: comparison of constant -1 with expression of type 'uint8_t' (aka 'unsigned char') is always true [-Wtautological-constant-out-of-range-compare]
   if (inputsRead & VARYING_BITS_TEX_ANY || p->wpos_tex != -1) {
                                            ~~~~~~~~~~~ ^  ~~
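
The underlying C behaviour can be reproduced in isolation. The sentinel
name and value below are hypothetical; they only illustrate the kind of
fix described above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical in-range sentinel, for illustration only. */
#define WPOS_TEX_INVALID 0xff

/* Compare a stored 8-bit slot against an int sentinel. The uint8_t
 * operand is promoted to int (range 0..255) before the comparison. */
static bool slot_differs(uint8_t stored, int sentinel)
{
   return stored != sentinel;
}

/* Old pattern: store -1, then compare with -1. */
static bool old_check_says_set(void)
{
   uint8_t wpos_tex = (uint8_t)-1;      /* stored as 0xff */
   return slot_differs(wpos_tex, -1);   /* 255 != -1, so always true */
}

/* Fixed pattern: store and compare the same in-range sentinel. */
static bool new_check_says_set(void)
{
   uint8_t wpos_tex = WPOS_TEX_INVALID;
   return slot_differs(wpos_tex, WPOS_TEX_INVALID);
}

static void demo_wpos_tex(void)
{
   assert(old_check_says_set());    /* "unset" is misread as set */
   assert(!new_check_says_set());   /* "unset" is detected */
}
```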

Cc: Chih-Wei Huang <cwhuang@android-x86.org>
Cc: Eric Anholt <eric@anholt.net>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Fixes: c349031c27 ("i915: Fix texcoord vs. varying collision in fragment programs")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-14 18:22:52 +03:00
Samuel Pitoiset
5f8b654b47 tgsi/scan: add missing 'static' to tgsi_is_bindless_image_file()
This should fix compilation errors in some situations.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101418
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 15:30:39 +02:00
Chuck Atkins
ad69b037b1 configure.ac: Reduce zlib requirement from 1.2.8 to 1.2.3.
Testing with zlib versions 1.2.{3,4,5,6,7,8} showed no difference in
functionality, correctness, or zlib API usage and 1.2.3 is the oldest
version available in still actively deployed production Linux
distributions (RHEL/CentOS 6 and SuSE 11).

Built 17.1.1 against the system-supplied zlib-devel packages for 1.2.3
on EL6 and 1.2.7 on EL7. I then swapped out the zlib version at runtime
via LD_LIBRARY_PATH with ones built from the release tarballs from
zlib.net.

Test-wise, I ran the piglit shader profile with --quick added to the
tests since I figured that would exercise the shader cache, which would
in turn use zlib.

Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com>
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
[Emil Velikov: add hunk about version/piglit testing]
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-14 12:03:22 +01:00
Samuel Pitoiset
65d1e4d1eb radeonsi: enable ARB_bindless_texture
This has only been tested on RX480.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
285ec4463b radeonsi: add support for loading bindless images
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
950b5ffa31 radeonsi: add support for loading bindless samplers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
0c2834c5b2 radeonsi: invalidate buffers which are made resident if needed
When a buffer becomes resident, check if it has been invalidated;
if so, update the descriptor and the dirty flag.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
811756dfd0 radeonsi: upload new descriptors when resident buffers are invalidated
When texture buffers are invalidated the addr in the resident
descriptor has to be updated but we can't create a new descriptor
because the resident handle has to be the same.

Instead, use the WRITE_DATA packet, which allows updating memory
directly, but graphics/compute have to be idle in case the GPU is
reading the descriptor.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
48fe8a6210 radeonsi: only decompress resident textures/images when used
When the currently bound shaders don't use any bindless textures
or images, it's useless to decompress the resident resources.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
2c3a7d5840 radeonsi: track use of bindless samplers/images from tgsi_shader_info
This adds some new helper functions to know if the current draw
call (or dispatch compute) is using bindless samplers/images,
based on TGSI analysis.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
e1813a8635 radeonsi: decompress resident textures/images before graphics/compute
Similar to the existing decompression code path except that it
loops over the list of resident textures/images.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
d7e1a66bb5 radeonsi: decompress DCC for resident textures/images
Analogous to bound textures/images. We should also update the
resident descriptors and disable COMPRESSION_EN to avoid
useless DCC fetches, but I am postponing this optimization to a
separate series.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
a45e198e2d radeonsi: only add descriptors in presence of resident handles
This won't help much except for applications that use a ton
of resident handles. Though, this will reduce the winsys
overhead a little bit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
333c8f65cf radeonsi: add all resident buffers to the current CS
Resident buffers have to be added to every new command stream.
Though, this could be slightly improved when current shaders
don't use any bindless textures/images but usually applications
tend to use bindless for almost every draw call, and the winsys
thread might help when buffers are added early.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
9cc328eef6 radeonsi: implement ARB_bindless_texture
This implements the Gallium interface. Decompression of resident
textures/images will follow in the next patches.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
77bbdcdfcd radeonsi: add a slab allocator for bindless descriptors
For each texture/image handle, we need to allocate a new
buffer for the bindless descriptor. But when the number of
buffers added to the current CS becomes high, the overhead
in the winsys (and in the kernel) is significant.

To reduce this bottleneck, the idea is to suballocate the
bindless descriptors using a slab similar to the one used
in the winsys.

Currently, a buffer can hold 1024 bindless descriptors, but
this limit is arbitrary and could be changed in the future.
Once a slab is allocated, the "base" buffer
is added to a per-context list.
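
A back-of-the-envelope sketch of why suballocation helps. The
1024-per-buffer figure is from the text above; the descriptor size and
the helper names are assumptions made for illustration only:

```c
#include <assert.h>
#include <stdint.h>

#define DESCS_PER_SLAB  1024u   /* from the commit message */
#define DESC_SIZE_BYTES 64u     /* assumed size, for illustration */

struct desc_slot {
   uint32_t slab;     /* index of the "base" buffer */
   uint32_t offset;   /* byte offset inside that buffer */
};

/* Map the n-th descriptor to its slab and offset. */
static struct desc_slot slot_for_descriptor(uint32_t index)
{
   struct desc_slot s;

   s.slab = index / DESCS_PER_SLAB;
   s.offset = (index % DESCS_PER_SLAB) * DESC_SIZE_BYTES;
   return s;
}

/* Number of base buffers that end up in the per-context list. */
static uint32_t slabs_needed(uint32_t descriptor_count)
{
   return (descriptor_count + DESCS_PER_SLAB - 1) / DESCS_PER_SLAB;
}

static void demo_slabs(void)
{
   /* 5000 handles need 5 base buffers instead of 5000 buffers. */
   assert(slabs_needed(5000) == 5);
   assert(slot_for_descriptor(1025).slab == 1);
   assert(slot_for_descriptor(1025).offset == DESC_SIZE_BYTES);
}
```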

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
86d7b7f01a radeonsi: add si_set_shader_image_desc() helper
To share some common code between bound and bindless images.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
410b4ec06d radeonsi: add si_set_sampler_view_desc() helper
To share some common code between bound and bindless textures.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
2ce04d7c1a radeonsi: add si_init_descriptor_list() helper
This will be used in order to initialize resident descriptors
for bindless textures/images.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
08ba871549 st/mesa: enable ARB_bindless_texture
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
b1288fad3c st/mesa: disable per-context seamless cubemap when using texture handles
The ARB_bindless_texture spec says:

   "If ARB_seamless_cubemap (or OpenGL 4.0, which includes it) is
    supported, the per-context seamless cubemap enable is ignored
    and treated as disabled when using texture handles."

   "If AMD_seamless_cubemap_per_texture is supported, the seamless
    cube map texture parameter of the underlying texture does apply
    when texture handles are used."

The per-context seamless cubemap flag should only be enabled for
bound textures/samplers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
76b8758253 st/mesa: make bindless samplers/images bound to units resident
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
66a2589d00 st/mesa: add infrastructure for storing bound texture/image handles
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
b6b915afa4 st/mesa: add st_create_{texture,image}_handle_from_unit() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
6f96bd7318 st/mesa: add st_convert_image_from_unit() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
32b4aa3499 st/mesa: make convert_sampler_from_unit() non-static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
9c1558d222 st/mesa: make update_single_texture() non-static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
3dd062ce2a st/mesa: implement ARB_bindless_texture
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
6447abf373 tgsi/scan: record bindless samplers/images usage
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
dd1ec664f5 st/glsl_to_tgsi: teach rename_temp_registers() about bindless samplers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
77cbded995 st/glsl_to_tgsi: teach the DCE pass about bindless samplers/images
When a texture (or an image) instruction uses a bindless sampler
(respectively a bindless image), make sure the DCE pass won't
remove code when the resource is a temporary variable.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
5d59226a7f st/glsl_to_tgsi: add support for bindless pack/unpack operations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
afafcbae30 st/glsl_to_tgsi: add support for bindless images
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
d2f84d541e st/glsl_to_tgsi: add support for bindless samplers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
556f70b404 tgsi/ureg: accept TGSI_FILE_{CONSTANT,INPUT} for dst registers
For example, TGSI_OPCODE_STORE for bindless images might use
a constant buffer or a shader input.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
ed4fbb84d1 tc: add ARB_bindless_texture support
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
e53e374b26 trace: add ARB_bindless_texture support
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
02743d63cc ddebug: add ARB_bindless_texture support
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
8a68b4de08 gallium: add ARB_bindless_texture interface
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
973822bcee gallium: add PIPE_CAP_BINDLESS_TEXTURE
Whether bindless texture operations are supported by the
underlying driver.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
990c8d15ac mesa: fix setting uniform variables for bindless samplers/images
This fixes a 64-bit vs 32-bit mismatch when setting an array
of bindless samplers. Also, we need to unconditionally set
size_mul to 2 when the underlying uniform is bindless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
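The 64-bit versus 32-bit mismatch mentioned in 990c8d15ac can be pictured with a small C sketch. It assumes that uniform storage is addressed in 32-bit slots, so each 64-bit handle occupies two of them (the size_mul of 2 from the message). The function and parameter names are hypothetical and this is not the code from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy `count` 64-bit handles into storage that is
 * addressed in 32-bit slots. Returns the number of slots written. */
static size_t store_handles(uint32_t *storage, size_t storage_slots,
                            const uint64_t *handles, size_t count)
{
   /* Assumption: a bindless handle always takes two 32-bit slots. */
   const size_t size_mul = 2;
   const size_t slots = count * size_mul;

   assert(slots <= storage_slots);
   memcpy(storage, handles, slots * sizeof(uint32_t));
   return slots;
}
```

Sizing the copy as count * sizeof(uint32_t), without the multiplier, would copy only half of the handle array, which is the kind of mismatch the message describes.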
Samuel Pitoiset
fe9f7095e8 mesa: handle bindless uniforms bound to texture/image units
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
804e6f2b76 mesa: associate uniform storage to bindless samplers/images
When a bindless sampler/image is bound to a texture/image unit,
we have to overwrite the constant value with the resident handle
directly in the constant buffer before the next draw.

One solution is to keep track of a pointer to the data.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
878a6e6eed mesa: pass gl_program to _mesa_associate_uniform_storage()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
70f2573103 mesa: update textures for bindless samplers bound to texture units
This is analogous to the existing SamplerUnits and SamplerTargets,
but it loops over bindless samplers bound to texture units.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
9eaad42c58 mesa: add update_single_program_texture_state() helper
This will also be used for looping over bindless samplers bound
to texture units.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
bf60db5a4b mesa: add update_single_shader_texture_used() helper
This will also be used for looping over bindless samplers bound
to texture units.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
064d6263c5 glsl: add ir_variable::contains_bindless()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
31154f0975 glsl: set the explicit binding value for bindless samplers/images
This handles a situation like:

layout (bindless_sampler, binding = 7) uniform sampler2D;

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
e3c6fba5d6 glsl: pass the ir_variable object to set_opaque_binding()
In order to set the explicit binding value for bindless
samplers/images.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
3a087dd7a4 glsl: process uniform images declared bindless
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
9e756de7d1 glsl: process uniform samplers declared bindless
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
444b703a88 mesa: add infrastructure for bindless samplers/images bound to units
Yes, ARB_bindless_texture allows this. In other words, in
a situation like:

layout (bindless_sampler) uniform sampler2D tex;

The 'tex' sampler uniform can be either set with glUniform1()
(old-style bound samplers) or with glUniformHandleui() (resident
handles).

When glUniform1() is used, we have to somehow make the texture
resident "under the hood". This is done by requesting a texture
handle from the driver, making the handle resident in the current
context and overwriting the value directly in the constant buffer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
4fe2a6ba7a mesa: store bindless samplers as PROGRAM_UNIFORM
Old-style samplers (i.e. bound samplers) are stored as
PROGRAM_SAMPLER, while bindless ones are PROGRAM_UNIFORM.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
156bcbaca6 mesa: keep track of the current variable in add_uniform_to_shader
Bindless samplers are considered PROGRAM_UNIFORM but
add_uniform_to_shader::visit_field() is based on glsl_type.

Because only ir_variable knows if the uniform variable is
bindless via ir_variable::bindless, store it instead of
adding a new parameter to visit_field().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
41257fddc8 mesa: refuse to change tex buffers when a handle is allocated
The ARB_bindless_texture spec says:

   "The error INVALID_OPERATION is generated by BufferData if it is
    called to modify a buffer object bound to a buffer texture while
    that texture object is referenced by one or more texture handles."

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
028a9b54c4 mesa: refuse to change textures when a handle is allocated
The ARB_bindless_texture spec says:

   "The error INVALID_OPERATION is generated by TexImage*, CopyTexImage*,
    CompressedTexImage*, TexBuffer*, TexParameter*, as well as other
    functions defined in terms of these, if the texture object to be
    modified is referenced by one or more texture or image handles."

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
eb9c708ee2 mesa: refuse to update tex parameters when a handle is allocated
The ARB_bindless_texture spec says:

   "The error INVALID_OPERATION is generated by TexImage*, CopyTexImage*,
    CompressedTexImage*, TexBuffer*, TexParameter*, as well as other
    functions defined in terms of these, if the texture object to be
    modified is referenced by one or more texture or image handles."

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
67ab372c60 mesa: refuse to update sampler parameters when a handle is allocated
The ARB_bindless_texture spec says:

   "The error INVALID_OPERATION is generated by SamplerParameter* if
    <sampler> identifies a sampler object referenced by one or more
    texture handles."

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
326a82a255 mesa: add support for glUniformHandleui64*ARB()
Bindless sampler/image handles are represented using 64-bit
unsigned integers.

The ARB_bindless_texture spec says:

   "The error INVALID_OPERATION is generated by UniformHandleui64{v}ARB
    if the sampler or image uniform being updated has the "bound_sampler"
    or "bound_image" layout qualifier."

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:36 +02:00
Samuel Pitoiset
afb141156f mesa: add support for unsigned 64-bit vertex attributes
This adds support in the VBO and array code to handle unsigned
64-bit vertex attributes as specified by ARB_bindless_texture.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:35 +02:00
Samuel Pitoiset
1fe7b1f972 mesa: implement ARB_bindless_texture
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:35 +02:00
Samuel Pitoiset
6649b840c3 mesa/util: add a hash table wrapper which supports 64-bit keys
Needed for bindless handles, which are represented using
64-bit unsigned integers. All hash table implementations should
be unified later on.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:35 +02:00
Samuel Pitoiset
eeb34af5be mesa: move some hash declarations to hash.h
These will be used by the bindless hash tables to initialize
the default deleted key value.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-14 10:04:35 +02:00
Samuel Pitoiset
30471eb745 mesa/util: add new util_dynarray_delete_unordered helper
This helper function will be used for managing dynamic arrays of
resident texture/image handles.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:35 +02:00
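An "unordered delete" on a dynamic array is usually done by overwriting the removed element with the last one and shrinking the array, which avoids shifting the tail. The C sketch below illustrates that idea for an array of 64-bit handles. The struct and function names are hypothetical; this is not the util_dynarray helper added by 30471eb745.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a dynamic array of 64-bit handles. */
struct handle_array {
   uint64_t *data;
   size_t count;
};

/* Remove the first element equal to `handle` by overwriting it with the
 * last element and shrinking the array. Element order is not preserved.
 * Returns 1 if an element was removed, 0 if the handle was not found. */
static int handle_array_delete_unordered(struct handle_array *arr,
                                         uint64_t handle)
{
   assert(arr != NULL);

   for (size_t i = 0; i < arr->count; i++) {
      if (arr->data[i] == handle) {
         arr->data[i] = arr->data[arr->count - 1];
         arr->count--;
         return 1;
      }
   }
   return 0;
}
```

The removal itself is constant time once the element is found, at the cost of changing the order of the remaining elements, which is acceptable for a set of resident handles.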
Samuel Pitoiset
5f249b9f05 mapi: add GL_ARB_bindless_texture entry points
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-14 10:04:35 +02:00
Aaron Watry
d364ab4a61 clover/device: Get device/host unified memory from pipe driver
clinfo no longer reports my discrete GCN card as unified memory

Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-06-13 21:26:09 -05:00
Henri Verbeet
1307ed430a gallium/radeon: Include the family name in the renderer string if it's not equal to the marketing name.
The "family" name is often more informative than the "marketing" name. More
importantly, applications, like for example Wine, may recognise GPUs based on
the existing "family" names.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Henri Verbeet <hverbeet@gmail.com>
2017-06-13 19:23:18 +02:00
Brian Paul
def8d1d23f gallium/docs: clarify TGSI_SEMANTIC_SAMPLEMASK, again
I've since discovered the fragment shader sample mask system value (which
corresponds to gl_SampleMaskIn).

v2: It's a system value, not a shader input.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-13 08:02:43 -06:00
Brian Paul
26500c3fad st/mesa: unmap the stream_uploader buffer before drawing
Some drivers require that the vertex buffers be unmapped prior to
drawing.  This change unmaps the stream_uploader buffer after we've
uploaded the zero-stride attributes (unless the driver supports
rendering with mapped buffers).

This fixes a regression in the VMware driver since 17f776c27b.
Some Mesa demos such as mandelbrot and brick would display black
quads instead of the expected rendering.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-13 07:52:54 -06:00
Brian Paul
e5eb9b4363 gallium/util: whitespace, formatting fixes in u_upload_mgr.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-13 07:52:54 -06:00
Eric Engestrom
40a8385d8b egl: improve dri2_fallback_swap_buffers_with_damage()
Let's (try to) set damages before swapping buffers.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-06-13 12:12:50 +01:00
Jose Fonseca
f2d71df0ca softpipe: Match pipe_context::render_condition prototype.
To silence compiler warnings.  Trivial.
2017-06-13 11:53:16 +01:00
Jose Fonseca
e1d4c966dc llvmpipe: Match pipe_context::render_condition prototype.
To silence compiler warnings.  Trivial.
2017-06-13 11:53:07 +01:00
Samuel Pitoiset
6d8a387f78 st_glsl_to_tgsi: init index to 0 before get_deref_offsets()
Fixes: 8ec4975cd8 ("st_glsl_to_tgsi: don't try and pass 32-bit values to get_deref_offsets")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101401
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2017-06-13 17:36:29 +09:00
Nicolai Hähnle
8dddb9788a glsl: simplify an assertion in lower_ubo_reference
Struct types are now equal when they're structurally equal.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-13 09:35:53 +02:00
Nicolai Hähnle
de32c8378c glsl: simplify validate_intrastage_arrays
Struct types are now equal when they are structurally equal.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-13 09:35:50 +02:00
Nicolai Hähnle
d21a35d63c glsl: simplify varying matching
Unnamed struct types are now equal if they have the same fields.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-13 09:35:48 +02:00
Nicolai Hähnle
f97c92ae11 glsl: remove redundant record_compare check when linking globals
Unnamed struct types are now equal across stages based on the fields they
contain, so overriding the type to make sure names match has become
unnecessary.

The check was originally introduced in commit 955c93dc08 ("glsl: Match
unnamed record types across stages.")

v2: clarify the commit message

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-13 09:35:45 +02:00
Nicolai Hähnle
835b1435f2 glsl: stop considering unnamed and named structures equal
Previously, if an unnamed and a named struct contained the same fields,
they were considered the same type during linking of globals.

The discussion around commit e018ea81bf ("glsl: Structures must have
same name to be considered same type.") doesn't seem to have considered
this thoroughly, and I see no evidence that an unnamed struct should
ever be considered to be the same type as a named struct.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-13 09:35:40 +02:00
Nicolai Hähnle
77ea2ada5a glsl: give all unnamed structs the same name
As a result, unnamed structs defined in different places of the program
are considered the same types if they have the same fields in the same
order.

This will simplify matching of global variables whose type is an unnamed
struct.

It also fixes a memory leak when the same shader containing unnamed
structs is compiled over and over again: instead of creating a new type
each time, the existing type is re-used.

Finally, this does have the effect that some previously rejected programs
are now accepted, such as:

   struct {
      float a;
   } s1;

   struct {
      float a;
   } s2;

   s2 = s1;

C/C++ do not allow that, but GLSL does seem to want to treat unnamed
structs with the same fields as the same type at least during linking
(and apparently, some applications require it), so it seems odd to treat
them as different types elsewhere.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-13 09:35:36 +02:00
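The effect of 77ea2ada5a, together with 835b1435f2, can be sketched as a structural comparison in C: two struct types match when their names match and their fields agree in order, and all unnamed structs share one placeholder name. The type layout, the ANON_NAME value and the function name are assumptions made for illustration and are not taken from glsl_type.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum base_type { TYPE_FLOAT, TYPE_INT };

struct field {
   const char *name;
   enum base_type type;
};

struct struct_type {
   const char *name;
   const struct field *fields;
   size_t num_fields;
};

/* Hypothetical placeholder name shared by every unnamed struct. */
#define ANON_NAME "#anon"

/* Two struct types are equal when the names match and the fields have
 * the same names and types in the same order. */
static bool struct_types_equal(const struct struct_type *a,
                               const struct struct_type *b)
{
   if (strcmp(a->name, b->name) != 0 || a->num_fields != b->num_fields)
      return false;

   for (size_t i = 0; i < a->num_fields; i++) {
      if (a->fields[i].type != b->fields[i].type ||
          strcmp(a->fields[i].name, b->fields[i].name) != 0)
         return false;
   }
   return true;
}
```

Because a named struct never carries the placeholder name, it cannot compare equal to an unnamed one, even when the fields are identical.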
Nicolai Hähnle
597b2486b8 glsl: do not add unnamed struct types to the symbol table
We removed the need for lookups, and we will assign them all the same
name in the future.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-13 09:35:32 +02:00
Nicolai Hähnle
0cb1f25d86 glsl: do not lookup struct types by typename
This changes the logic during the conversion of the declaration list

   struct S {
      ...
   } v;

from AST to IR, but should not change the end result.

When assigning the type of v, instead of looking `S' up in the symbol
table, we read the type from the member variable of ast_struct_specifier.

This change is necessary for the subsequent change to how anonymous types
are handled.

v2: remove a type override when redefining a structure; should be
    the same type in that case anyway

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-13 09:35:29 +02:00
Nicolai Hähnle
d6ec0aa7ed glsl: fix a race condition when inserting new types
By splitting glsl_type::mutex into two, we can avoid dropping the hash
mutex while creating the new type instance (e.g. struct/record,
interface).

This fixes a time-of-check/time-of-use race where two threads would
simultaneously attempt to create the same type but end up with different
instances of glsl_type.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-13 09:35:10 +02:00
Timothy Arceri
2e28e8b199 st/mesa: skip texture validation logic when nothing has changed
Based on the same logic in the i965 driver 2f225f6145 and
16060c5adc.

perf reports st_finalize_texture() going from 0.60% -> 0.16% with
this change when running the Xonotic benchmark from PTS.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-13 11:24:32 +10:00
Dave Airlie
95c0591087 ac/gpu: drop duplicated code line.
has_hw_decode is assigned twice.

Pointed out by coverity.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-13 10:01:40 +10:00
Dave Airlie
9cce302951 radv: move assert down in radv_bind_descriptor_set
Coverity complains about the dereference before the NULL check.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-13 10:01:36 +10:00
Dave Airlie
b9e76b0c44 radv: return correct error on invalid handle from vkAllocateMemory
Coverity pointed out this was returning an uninitialised value.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-13 09:30:19 +10:00
Dave Airlie
8ec4975cd8 st_glsl_to_tgsi: don't try and pass 32-bit values to get_deref_offsets
Just use a temporary 16-bit index.

This fixes a Coverity issue pointed out to me by Ilia.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-13 09:29:54 +10:00
Dave Airlie
ca69b5e78c u_dynarray: fix coverity warning about ignoring return value from reralloc
>>>     Ignoring storage allocated by "reralloc_size(buf->mem_ctx, buf->data, buf->size)" leaks it.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-13 06:40:25 +10:00
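The warning quoted in ca69b5e78c concerns a reallocation whose return value is dropped. The C sketch below shows the general pattern of keeping the returned pointer. It uses libc realloc as a stand-in for ralloc's reralloc_size, and the struct and function names are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical growable byte buffer. */
struct byte_buf {
   void *data;
   size_t size;      /* bytes in use */
   size_t capacity;  /* bytes allocated */
};

/* Shrink the allocation to the bytes in use. The pointer returned by
 * realloc is stored back, because the block may have moved.
 * Returns 0 on success and -1 if the reallocation failed. */
static int byte_buf_trim(struct byte_buf *buf)
{
   if (buf->size == 0) {
      free(buf->data);
      buf->data = NULL;
      buf->capacity = 0;
      return 0;
   }

   void *p = realloc(buf->data, buf->size);
   if (p == NULL)
      return -1;   /* the old block is still valid and still owned */

   buf->data = p;
   buf->capacity = buf->size;
   return 0;
}
```

Calling realloc without assigning the result would leave buf->data pointing at a block that may already have been released, while the new block is never freed.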
Dave Airlie
53587b7105 glsl/lower_distance: only set max_array_access for 1D clip dist arrays
The max_array_access field applies to the first dimension, which means
we only want to set it for the 1D clip dist arrays.

This fixes an ir_validate assert seen with
KHR-GL44.cull_distance.functional
on nouveau and radeon with debug builds.

Fixes: a08c4ebbe (glsl: rewrite clip/cull distance lowering pass)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-12 20:37:06 +01:00
Lionel Landwerlin
1c5d4c9d74 i965: fix missing break
Pretty obvious missing break statement.

CID: 1412564
Fixes: 641405f797 "i965: Use the new tracking mechanism for HiZ"
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2017-06-12 20:30:19 +01:00
Marek Olšák
4951b0adbd radeonsi: pack si_context better
there isn't much to gain here

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
6d43d352cc radeonsi: pack si_framebuffer better
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
ca815f1ead radeonsi: pack si_sampler_view better
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
29bf2530d8 radeonsi: pack si_buffer_resources better
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
cf5ce61148 radeonsi: pack struct si_descriptors better
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
217114dd73 radeonsi: pack struct si_vertex_elements better
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
e80a056ff9 radeonsi: replace si_vertex_elements::elements with separate fields
It makes si_vertex_elements a little smaller.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
c8b6f42e25 radeonsi: rename si_vertex_element -> si_vertex_elements
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
7be6186e0c radeonsi: allocate si_state_rasterizer::pm4_poly_offset only when needed
Each element has over 700 bytes.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
a828f5d783 radeonsi: pack si_state_rasterizer fields
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
6b6fed3a3c radeonsi: remove 8 bytes from si_shader_key with uint32_t ff_tcs_inputs_to_copy
The previous patch helps with this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
7b2240ac9c radeonsi: use uint32_t to declare si_shader_key.opt.kill_outputs
the next patch will benefit from this

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
1621b33d73 radeonsi: remove 8 bytes from si_shader_key by flattening opt.hw_vs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
30882ba0dd radeonsi: don't emit DB_STENCIL_CONTROL if it has no effect
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
6743dc01fd radeonsi: fix missing num_L2_invalidates increment
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
c503381864 radeonsi: get rid of more compressed_colortex_mask names
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
3d8259194d gallium/noop: fix sampler views
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
7448342a1f gallium/docs: clarify gen_name/get_vendor/get_device_vendor behavior
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
0d62e8a727 st/mesa: call check_program_state only when needed
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
9a22c85618 r600g: set pipe_context::priv = NULL
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101254

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Marek Olšák
e8be83f7f8 vl,omx,va,vdpau,xvmc: don't set the priv pointer in context_create
Unused and radeonsi ignores it anyway.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 18:24:37 +02:00
Juan A. Suarez Romero
621a784529 r600/eg: distribute egd_tables.py in the dist file
Otherwise, `make distcheck` will fail.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-12 11:35:24 +02:00
Juan A. Suarez Romero
4152edbcde i965: include gen4_blorp_exec.h into EXTRA_DIST
Otherwise, `make distcheck` will fail.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-12 10:32:18 +02:00
Kenneth Graunke
b7153c3e9f i965: Call intel_prepare_render() from intel_update_state()
The resolve code looks at the current color draw buffers.  These are not
valid until intel_prepare_render() is called.  You can end up with one
color buffer bound, but where the renderbuffer has zero width/height and
no miptree allocated.

You can get a call chain like: _mesa_Clear -> _mesa_update_state ->
intel_update_state, where no brw driver hooks were called, so there is
no other point at which we could have called this.

Fixes crashes in KWin where Clear was causing intel_disable_rb_aux_buffer
to crash on irb != NULL but irb->mt == NULL.

According to Tapani, this also fixes crashes seen on Android.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
2017-06-12 01:10:36 -07:00
Grazvydas Ignotas
fae3b13905 radv: fix trace dumping for !use_ib_bos
Fixes trace dumping crash for SI or when RADV_DEBUG=noibs is set.

Fixes: 97dfff5410 "radv: Dump command buffer on hang."
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-06-11 23:07:09 +03:00
Grazvydas Ignotas
f56aa25ac5 radv: don't even attempt to prefetch on SI
Before bcae327469 this was emitting a CP DMA packet even on SI, which
apparently did not cause too many problems. After that commit the
CP DMA code always sets the CIK+ only bit for prefetch. Just
follow radeonsi there and don't try to prefetch at all.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101334
Fixes: bcae327469 "radv: realign cp dma code with radeonsi"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-06-11 14:28:40 +03:00
Grazvydas Ignotas
f490200973 radv: assert on CP_DMA_USE_L2 for SI
The register header (and radeonsi comment) states V_411_SRC_ADDR_TC_L2
is for CIK+ only, so let's assert on earlier ASICs.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-06-11 14:28:08 +03:00
Harish Krupo
9827547313 egl/android: support for EGL_KHR_partial_update
This patch adds support for the EGL_KHR_partial_update extension for
the Android platform. It passes 36/37 tests in dEQP for
EGL_KHR_partial_update. The remaining test is not supported.

v2: add fallback for eglSetDamageRegionKHR (Tapani)

v3: The native_window_set_surface_damage call is available only from
    Android version 6.0. Reintroduce the ANDROID_VERSION guard and
    advertise extension only if version is >= 6.0. (Emil Velikov)

v4: use newly introduced ANDROID_API_LEVEL guard rather than
    ANDROID_VERSION guard to advertise the extension. The extension
    is advertised only if ANDROID_API_LEVEL >= 23 (Android 6.0 or
    greater). Add fallback function for platforms other than Android.
    Fix possible math overflow. (Emil Velikov)
    Return immediately when n_rects is 0. Place function's entrypoint
    in alphabetical order. (Eric Engestrom)

v5: Replace unnecessary calloc with malloc (Eric)
    Check for BAD_ALLOC error (Emil)
    Check for error in native_window_set_damage_region. (Emil, Tapani,
    Eric).

Signed-off-by: Harish Krupo <harish.krupo.kps@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-06-11 01:02:09 +01:00
Marius Gräfe
f3c0bbe18a gallium: fixed modulo zero crashes in tgsi interpreter (v2)
softpipe throws integer division by zero exceptions on Windows
when using % with integers in a geometry shader.

v2: Made error results consistent with existing div/mod zero handling in
    tgsi. 64 bit signed integer division by zero returns zero like in
    micro_idiv, unsigned returns ~0u like in micro_udiv.
    Modulo operations always set all result bits to one (like in
    micro_umod).

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-06-10 16:40:13 +02:00
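The results listed under v2 of f3c0bbe18a can be illustrated with a minimal C sketch for the 64-bit case. The helper names are hypothetical and are not the interpreter's micro_* functions. The INT64_MIN / -1 guard is an extra assumption, added only so that the sketch itself avoids undefined behaviour in C.

```c
#include <stdint.h>

/* Hypothetical helpers illustrating the divide/modulo-by-zero results
 * described in the commit message; not the TGSI interpreter's code. */

/* Signed division: a zero divisor yields 0. */
static int64_t sketch_idiv64(int64_t a, int64_t b)
{
   if (b == 0)
      return 0;
   if (a == INT64_MIN && b == -1)   /* assumption: avoid overflow in C */
      return INT64_MIN;
   return a / b;
}

/* Unsigned division: a zero divisor yields all bits set. */
static uint64_t sketch_udiv64(uint64_t a, uint64_t b)
{
   return b ? a / b : ~(uint64_t)0;
}

/* Signed modulo: a zero divisor yields all bits set, i.e. -1. */
static int64_t sketch_imod64(int64_t a, int64_t b)
{
   if (b == 0)
      return -1;
   if (a == INT64_MIN && b == -1)   /* assumption: avoid overflow in C */
      return 0;
   return a % b;
}

/* Unsigned modulo: a zero divisor yields all bits set. */
static uint64_t sketch_umod64(uint64_t a, uint64_t b)
{
   return b ? a % b : ~(uint64_t)0;
}
```

Checking the divisor before the operation is what prevents the hardware exception; the specific fallback values only need to match the existing handlers.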
Grazvydas Ignotas
29b9f35704 nir: make various getters take const pointers
This will allow constifying other things.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-10 16:48:45 +03:00
Ben Widawsky
e179a3438a i965/cnl: Add a preliminary device for Cannonlake
v2 (Anuj):
Rebased on master and updated pci ids
Remove redundant initialization of max_wm_threads to 64 * 12.
For gen9+ max_wm_threads are initialized in gen_get_device_info().

v3 (Anuj):
Move the patch to end of series.
Remove unused gt1, gt2, gt3 functions.
Remove l3_banks variable. Variable is now available on master.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-06-09 16:03:00 -07:00
Jason Ekstrand
f2cbf738b4 anv: Don't advertise support on anything above gen9
This will prevent the driver from even trying to work on Cannon Lake
until we get actual support added.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-06-09 16:03:00 -07:00
Anuj Phogat
9acc93feeb i965/cnl: Enable CCS_E and RT support for few formats
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:59 -07:00
Anuj Phogat
61f171292e i965/cnl: Reformat surface_format_info table to accommodate gen10+
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:59 -07:00
Anuj Phogat
f9e31a26d4 i965/cnl: Make URB {VS, GS, HS, DS} sizes non multiple of 3
v1: By Ben Widawsky <benjamin.widawsky@intel.com>
v2: v1 had an assert only for VS. Add the restriction for GS, HS and
    DS as well and make sure the allocated sizes are not a multiple of 3.
v3: Move the entry_size checks in to compiler code (Ken)

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-09 16:02:59 -07:00
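
One simple way to satisfy the "not a multiple of 3" restriction from f9e31a26d4 above is to bump an offending size by one unit. The sketch below assumes that approach; the function name and the plain integer `gen` parameter are hypothetical and not taken from the driver.

```c
/* Hypothetical helper: keep a URB entry size (in allocation units)
 * from being a multiple of 3 on gen10. Other gens are left untouched. */
static unsigned sketch_adjust_urb_entry_size(int gen, unsigned entry_size)
{
   if (gen == 10 && entry_size % 3 == 0)
      entry_size++;   /* e.g. 6 becomes 7, which is no longer divisible by 3 */
   return entry_size;
}
```

Incrementing can never produce another multiple of 3, so a single check is enough.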
Anuj Phogat
b76659997e i965/cnl: Don't resolve single sampled color rb in case of sRGB formats
As sRGB now supports lossless compression, we also need to stop resolving
single sampled color render buffers for sRGB formats in Gen 10.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:59 -07:00
Ben Widawsky
640f5d3957 i965/cnl: Implement depth count workaround
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:59 -07:00
Anuj Phogat
8c43e33560 i965/cnl: Start using CNL MOCS defines
CNL MOCS defines are duplicates of SKL MOCS defines.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:59 -07:00
Anuj Phogat
111881abac i965/cnl: Handle gen10 in switch cases across the driver
V2: Start using gen10 functions isl_gen10*(), gen10_blorp_exec()
    gen10_init_atoms() (Jason)
    Remove Vulkan changes. Do them later in a separate patch.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:59 -07:00
Anuj Phogat
30e749c8f1 i965/cnl: Update few assertions
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:59 -07:00
Anuj Phogat
56b4d82729 i965/cnl: Add cnl bits in aubinator
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:58 -07:00
Anuj Phogat
dd6c27ace1 i965/cnl: Add pci id for INTEL_DEVID_OVERRIDE
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:58 -07:00
Anuj Phogat
dc83ce7a16 i965/cnl: Wire up android Mesa build files for gen10
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-06-09 16:02:58 -07:00
Anuj Phogat
e01c5a6824 i965/cnl: Wire up Mesa build files for gen10
V2: Remove isl_gen10.c and isl_gen10.h

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-09 16:02:58 -07:00
Anuj Phogat
2417d5ca19 intel/genxml: Update genx_bits for gen10+
This commit adds a gen10 case to the switch statement and
drops some unneeded code for handling gen numbers which
doesn't work on gen10 and above.

V2: Drop "z = float(z)" and the "z *= 10" lines

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:58 -07:00
Anuj Phogat
98b95a3735 i965/cnl: Add gen10 specific function declarations
These declarations will help the code start compiling
once we wire up the makefiles for gen10. Later patches
will start using these functions for gen10.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:58 -07:00
Anuj Phogat
2704ccc646 i965/cnl: Include gen10_pack.h
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:58 -07:00
Anuj Phogat
a48cb9cf7f i965/cnl: Define genX(x) and GENX(x) for gen10
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 16:02:58 -07:00
Jason Ekstrand
aa416f515a i965/genxml: Add gen10.xml
V2(Anuj):
Add default value for length of 3DPRIMITIVE command
Add values for 'Attribute Active Component Format'
Rename few fields to match gen9.xml

V3 (Ander Conselvan de Oliveira)
Add gen10 alias for MOCS
Make 3DSTATE_CONSTANT_BODY on Gen10 use arrays

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-06-09 16:00:49 -07:00
Ben Widawsky
d968f072bc i965: Make feature macros gen8 based
All the "features" of the hardware are similar starting with GEN8, so remove as
much of the GEN9 uniqueness as possible. This makes implementing future gen
platforms a bit easier.

Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-09 15:27:14 -07:00
Dave Airlie
51553c0bea radv: set fmask state to all 0s when no fmask. (v2)
The shader reads the descriptor to decide if it should take the
fmask value, however we weren't initing it always, which meant
random crap, esp with MSAA depth textures.

Fixes random hangs with:
dEQP-VK.glsl.builtin_var.fragdepth.*

v2: check fmask_state is not NULL

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-09 20:41:55 +01:00
Matt Turner
71651b3139 i965: Temporarily disable async mappings on non-LLC
Fixes regressions from commits e0a9b261e5 and a16355d67d by
neutering async mappings on non-LLC to be synchronous, like they were
before those two commits. :(

The failing tests include

piglit-test piglit.spec.nv_primitive_restart.primitive-restart-vbo_index_only
piglit-test piglit.spec.nv_primitive_restart.primitive-restart-vbo_combined_vertex_and_index
piglit-test piglit.spec.nv_primitive_restart.primitive-restart-vbo_separate_vertex_and_index
piglit-test piglit.spec.nv_primitive_restart.primitive-restart-vbo_vertex_only
piglit-test piglit.spec.arb_pixel_buffer_object.texsubimage-unpack pbo
2017-06-09 12:14:28 -07:00
Rafael Antognolli
d42fc65bb3 mesa/main/debug: Check if we successfully reopened the ppm file.
Since we created the file, we should be able to reopen it for appending, but
some weird filesystem error could cause that to be false. So simply check
whether we could reopen it or not.

CID: 1177144
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-06-09 10:21:16 -07:00
Brian Paul
81e15a5dea tgsi: clarify TGSI_SEMANTIC_SAMPLEMASK documentation
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-09 08:51:56 -06:00
Frank Richter
0ef39e588f gallium/wgl: Allow context creation even if SetPixelFormat() wasn't called
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101326
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-06-09 08:51:45 -06:00
Varad Gautam
f84bb6a9d9 st/dri: support format modifier queries
ask the driver for supported modifiers for a given format.

v2: move to __DRIimageExtension v16.
v3: fail if the supplied format is not supported by driver.
v4: purge PIPE_CAP_QUERY_DMABUF_ATTRIBS.
v5:
- move to __DRIimageExtension v15, pass external_only to the driver.

Signed-off-by: Varad Gautam <varad.gautam@collabora.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de> (v4)
Cc: Lucas Stach <l.stach@pengutronix.de>
2017-06-09 14:12:37 +01:00
Varad Gautam
e0965a2c8e gallium: introduce format modifier querying
format modifier tokens are driver specific, and hence, need to come
in from the driver. this allows drivers to be queried for supported
format modifiers for EGL_EXT_image_dma_buf_import_modifiers.

v2: rebase to master.
v3: drivers must return false on query failure.
v4: use pscreen->is_format_supported instead of adding a separate
    format query handle, remove PIPE_CAP_QUERY_DMABUF_ATTRIBS.
    (Lucas Stach)
v5: add external_only parameter.

Signed-off-by: Varad Gautam <varad.gautam@collabora.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-06-09 14:12:37 +01:00
Varad Gautam
cf748242d1 st/dri: support format queries
ask the driver for supported dmabuf formats

v2: rebase to master.
v3: return false on failure.
v4: use pscreen->is_format_supported instead of adding a new query.
    (Lucas Stach)
v5: stylefix to conform to formatting rules (Brian Paul). add fourcc list
    here instead of using struct image_format from v4.

Signed-off-by: Varad Gautam <varad.gautam@collabora.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de> (v4)
Cc: Lucas Stach <l.stach@pengutronix.de>
2017-06-09 14:12:37 +01:00
Varad Gautam
82b3d1fa9a st/dri: implement DRIimage creation from dmabufs with modifiers
support importing dmabufs into DRIimage while taking format modifiers
into account, as per DRIimage extension version 15.

v2: initialize winsys modifier to DRM_FORMAT_MOD_INVALID (Daniel Stone)
v3: do not bump DRIimageExtension version. split out winsys changes.

Signed-off-by: Varad Gautam <varad.gautam@collabora.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-06-09 14:12:37 +01:00
Varad Gautam
f61a8ba168 st/dri: implement createImageWithModifiers in DRIimage
adds a pscreen->resource_create_with_modifiers() to create textures
with modifier.

v2:
- stylefixes (Emil Velikov)
- don't return selected modifier from resource_create_with_modifiers. we can
  use the winsys_handle to get this.

Signed-off-by: Varad Gautam <varad.gautam@collabora.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de> (v1)
Cc: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-06-09 14:12:37 +01:00
Varad Gautam
d33fe8b84e st/dri: enable DRIimage modifier queries
return the modifier selected by the driver when creating this image.

v2: since we can use winsys_handle->modifier to serve these, remove
    DRIimage->modifier from v1.
    use DRM_API_HANDLE_TYPE_KMS instead of DRM_API_HANDLE_TYPE_FD to avoid
    ownership transfer. (Lucas)

Suggested-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Varad Gautam <varad.gautam@collabora.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-06-09 14:12:37 +01:00
Varad Gautam
3f8513172f gallium/winsys/drm: introduce modifier field to winsys_handle
we use this to import resources with format modifiers, and to support
per-resource modifier queries.

Signed-off-by: Varad Gautam <varad.gautam@collabora.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-06-09 14:12:37 +01:00
Samuel Pitoiset
cde963ec35 mesa: make use of NewScissorTest driver flags
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-09 09:33:28 +02:00
Samuel Pitoiset
328191f26c mesa: make use of NewScissorRect driver flags
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-09 09:33:26 +02:00
Samuel Pitoiset
4c037af9cc mesa: add gl_driver_flags::NewScissor{Rect,Test}
_NEW_SCISSOR mesa flag is set when a scissor test is enabled/disabled
or when a new rectangle is defined. However, it triggers too many
changes in the state tracker.

Actually, ST_NEW_RASTERIZER should only be called when a scissor
test is enabled/disabled, while ST_NEW_SCISSOR should be called
in both situations.

In other words, this avoids updating the rasterizer every
time a new rectangle is defined using glScissor*().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-09 09:33:22 +02:00
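
The split described in 4c037af9cc above amounts to using separate dirty bits for "scissor rectangle changed" and "scissor test toggled". A minimal sketch follows, assuming a made-up context struct and flag names; these are not the actual gl_driver_flags or ST_NEW_* definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical dirty bits standing in for the state tracker's flags. */
enum {
   SKETCH_DIRTY_SCISSOR    = 1u << 0,
   SKETCH_DIRTY_RASTERIZER = 1u << 1,
};

/* Hypothetical context holding scissor state and pending dirty bits. */
struct sketch_ctx {
   bool scissor_enabled;
   int x, y, width, height;
   uint32_t dirty;
};

/* Toggling the test affects both the scissor and the rasterizer state. */
static void sketch_set_scissor_test(struct sketch_ctx *ctx, bool enable)
{
   if (ctx->scissor_enabled == enable)
      return;
   ctx->scissor_enabled = enable;
   ctx->dirty |= SKETCH_DIRTY_SCISSOR | SKETCH_DIRTY_RASTERIZER;
}

/* A new rectangle only needs the scissor state re-emitted. */
static void sketch_set_scissor_rect(struct sketch_ctx *ctx,
                                    int x, int y, int width, int height)
{
   ctx->x = x;
   ctx->y = y;
   ctx->width = width;
   ctx->height = height;
   ctx->dirty |= SKETCH_DIRTY_SCISSOR;
}
```

With a single combined flag, every rectangle change would also mark the rasterizer dirty, which is the extra work the commit avoids.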
Tapani Pälli
8fac894f9b egl: fix _eglQuerySurface in EGL_BUFFER_AGE_EXT case
Specification states that in case of error, value should not be
written, patch changes buffer age queries to return -1 in case of
error so that we can skip changing the value.

In addition, small change to droid_query_buffer_age to return 0
in case buffer does not have a back buffer available.

Fixes:
   dEQP-EGL.functional.negative_partial_update.not_postable_surface

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Cc: mesa-stable@lists.freedesktop.org
2017-06-09 07:39:22 +03:00
Dave Airlie
c2464271a0 radv: introduce perf test env var and allow to enable chaining
We have some features that seem to slow things down or cause other
possible undesirable side effects, but it would be nice to test
games etc with them easily.

I foresee multisample DCC and maybe some shader opt changes using this.

For now use it for batch chaining.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-09 02:15:25 +01:00
Timothy Arceri
d0a26edc25 mesa: add KHR_no_error support to glDrawRangeElements*()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-09 09:58:07 +10:00
Timothy Arceri
87cb44d9b0 mesa: rework _ae_invalidate_state() so that it just sets a dirty flag
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-09 09:13:46 +10:00
Timothy Arceri
b57bc7473b mesa: remove redundant _ae_invalidate_state() call
The FLUSH_VERTICES(ctx, _NEW_ARRAY) above this will already cause
this to be called.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-09 09:13:46 +10:00
Timothy Arceri
bc70bad59b mesa: inline vbo_exec_invalidate_state() and call from mesa core
Rather than calling it indirectly in each driver.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-09 09:13:46 +10:00
Timothy Arceri
99987fe92e mesa: rework vbo_exec_init()
Here we make some assumptions about the AEcontext and set the
recalculate bools directly.

Some formatting fixes are also made while we are here.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-09 09:13:46 +10:00
Timothy Arceri
f77740f14b mesa: stop passing state bitfield to UpdateState()
The code comment which seems to have been added in cab974cf6c
(from year 2000) says:

   "Set ctx->NewState to zero to avoid recursion if
   Driver.UpdateState() has to call FLUSH_VERTICES().  (fixed?)"

As far as I can tell nothing in any of the UpdateState() calls
should cause it to be called recursively.

V2: add a wrapper around the osmesa update function so it can still
    be used internally.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-09 09:13:46 +10:00
Timothy Arceri
f627ac6e35 st/mesa: add st_invalidate_buffers() helper
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-09 09:13:46 +10:00
Timothy Arceri
df27aba422 r200/radeon: stop calling _ae_invalidate_state() directly
It is already called via _vbo_InvalidateState().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2017-06-09 09:13:46 +10:00
Tim Rowley
0b80b02502 swr: relax c++ requirement from c++14 to c++11
Remove c++14 generic lambda to keep compiler requirement at c++11.

No regressions on piglit or vtk test suites.

Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

CC: mesa-stable@lists.freedesktop.org
2017-06-08 18:07:52 -05:00
Juan A. Suarez Romero
a625d58ee1 radeonsi: call LLVMAddEarlyCSEMemSSAPass only for LLVM >= 4.0
LLVMAddEarlyCSEMemSSAPass() is defined in LLVM 4.0.

Fixes: 257b538 ("radeonsi: do EarlyCSEMemSSA LLVM pass")

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-06-08 23:32:32 +02:00
Marek Olšák
6940361796 gallium/radeon: don't allocate HTILE in a separate buffer
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-08 23:29:07 +02:00
Marek Olšák
c6451b1209 radeonsi: rename depth decompress functions
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-08 23:29:07 +02:00
Marek Olšák
d8a577d96e radeonsi: rename shader resource decompress masks to their true meaning
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-08 23:29:07 +02:00
Marek Olšák
da26de5ff7 radeonsi: rename is_compressed_colortex -> color_needs_decompression
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-08 23:29:07 +02:00
Marek Olšák
391673af7a radeonsi: disable the patch ID workaround on SI when the patch ID isn't used (v2)
The workaround causes a massive performance decrease on 1-SE parts.
(Cape Verde, Hainan, Oland)

The performance regression is already part of 17.0 and 17.1.

v2: check tess_uses_prim_id

Cc: 17.0 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-08 23:29:07 +02:00
Marek Olšák
4b8d0c2b1d radeonsi: don't update dependent states if it has no effect (v2)
This and the previous clip_regs commit decrease IB sizes and the number of
si_update_shaders invocations as follows:

                 IB size   si_update_shaders calls
Borderlands 2      -10%            -27%
Deus Ex: MD         -5%            -11%
Talos Principle     -8%            -30%

v2: always dirty cb_render_state in set_framebuffer_state

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-08 23:29:07 +02:00
Varad Gautam
f804e0672e i965: Add format/modifier advertising
v2: Rebase and reuse tiling/modifier map. (Daniel Stone)
v3: bump DRIimageExtension to version 15, fill external_only array.
v4: Y-tiling works since gen 6

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-08 22:27:30 +01:00
Varad Gautam
c303772e5b i965: Support dmabuf import with modifiers
Add support for createImageFromDmaBufs2, adding a modifier to the
original, and allow importing CCS resources with auxiliary data from
dmabufs.

v2: avoid DRIimageExtension version bump, pass single modifier to
    createImageFromDmaBufs2.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-08 22:27:30 +01:00
Daniel Stone
f58e6358bf i965: Improve same-buffer restriction for imports
Intel hardware requires that all planes of an image come from the same
buffer, which is currently implemented by testing that all FDs are
numerically the same.

However, when going through a winsys, for example, or anything which transits
FDs individually, the FDs may be different even if the underlying buffer
is the same.

Instead of checking the FDs for equality, we must check if they actually
point to the same buffer (Jason).

Reviewed-by: Varad Gautam <varad.gautam@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-08 22:27:30 +01:00
Ben Widawsky
37cdcaf386 i965: Allocate tile aligned height
This patch shouldn't actually do anything because the libdrm function
should already do this alignment. However, it preps us for a future
patch where we add in the CCS AUX size, and in the process it serves as
a good place to find bisectable issues if libdrm or kernel does
something incorrectly.

v2: Do proper alignment for X tiling, and make sure non-tiled case is
handled (Jason)
v3: Rebase (Daniel)

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-08 22:27:30 +01:00
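
The alignment in 37cdcaf386 above can be pictured as rounding the height up to a whole number of tile rows. The sketch below uses hypothetical names and assumes the commonly documented tile heights of 8 rows for X tiling and 32 rows for Y tiling; the values the driver actually uses may differ.

```c
#include <stdint.h>

/* Hypothetical tiling enum; not the driver's own definitions. */
enum sketch_tiling {
   SKETCH_TILING_NONE,
   SKETCH_TILING_X,
   SKETCH_TILING_Y,
};

/* Round value up to the next multiple of alignment (alignment > 0). */
static uint32_t sketch_align(uint32_t value, uint32_t alignment)
{
   return (value + alignment - 1) / alignment * alignment;
}

/* Height in rows after rounding up to a whole tile.
 * Assumed tile heights: X = 8 rows, Y = 32 rows. */
static uint32_t sketch_tile_aligned_height(enum sketch_tiling tiling,
                                           uint32_t height)
{
   switch (tiling) {
   case SKETCH_TILING_X:
      return sketch_align(height, 8);
   case SKETCH_TILING_Y:
      return sketch_align(height, 32);
   default:
      return height;   /* linear buffers need no tile alignment */
   }
}
```

The default case corresponds to the non-tiled handling mentioned in v2 of the commit.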
Daniel Stone
78703881ff i965: Move fallback size assignment out of bufmgr
The bufmgr took a mandatory size argument, which would only be used if
the kernel size query failed, i.e. an older kernel. It didn't actually
check that the BO size was sufficient for use.

Pull the check out of the bufmgr, and actually check that the BO is
sufficiently-sized for our import one level up. This also resolves a
chicken/egg we have when importing buffers without explicit modifiers,
namely that we need the tiling mode to calculate the size, but we need
the BO imported to query the tiling mode.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-08 22:27:30 +01:00
Daniel Stone
6b18d4aaec i965: Invert image modifier/tiling inference
When allocating images, we record a tiling mode and then work backwards
to infer the modifier. Unfortunately this is the wrong way around, since
it is a one:many mapping (e.g. TILING_Y can be plain Y-tiling, or
Y-tiling with CCS).

Invert the mapping, so we record a modifier first and then map this to a
tiling mode.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-08 22:27:30 +01:00
Daniel Stone
11e549ae3f egl/dri2: Avoid sign extension when building modifier
Since the EGL attributes are signed integers, a straight OR would
also perform sign extension.

Fixes: 6f10e7c37a ("egl/dri2: Create EGLImages with dmabuf modifiers")
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-08 22:27:30 +01:00
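
The sign-extension problem in 11e549ae3f above is easy to reproduce in isolation. The sketch below assumes the 64-bit modifier arrives as two signed 32-bit halves (EGL attributes are signed integers); the function names are hypothetical and the fixed version is only one possible way to write the fix.

```c
#include <stdint.h>

/* Buggy form: lo is converted to uint64_t with sign extension, so a low
 * half with bit 31 set turns every bit of the high half into 1. */
static uint64_t sketch_modifier_naive(int32_t hi, int32_t lo)
{
   return ((uint64_t)hi << 32) | lo;
}

/* Fixed form: go through uint32_t first so only 32 bits of each half
 * contribute to the result. */
static uint64_t sketch_modifier_fixed(int32_t hi, int32_t lo)
{
   return ((uint64_t)(uint32_t)hi << 32) | (uint32_t)lo;
}
```

For `hi = 1` and `lo = INT32_MIN` (bit pattern 0x80000000), the naive version yields 0xFFFFFFFF80000000 while the fixed version yields 0x0000000180000000.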
Vinson Lee
142536a0e3 i915g: Add blitter_context argument.
Fix build error.

  CC       i915_surface.lo
i915_surface.c:108:63: error: too few arguments to function call, expected 4, have 3
   util_blitter_default_src_texture(&src_templ, src, src_level);
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                           ^
../../../../src/gallium/auxiliary/util/u_blitter.h:271:1: note: 'util_blitter_default_src_texture' declared here
void util_blitter_default_src_texture(struct blitter_context *blitter,
^

Fixes: a893c91697 ("gallium/u_blitter: use 2D_ARRAY for cubemap blits if possible")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101340
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-08 13:47:39 -07:00
Lucas Stach
978e6876f1 etnaviv: flush resource when binding as sampler view
As TS is also allowed on sampler resources, we need to make sure to resolve
to self when binding the resource as a texture, to avoid stale content
being sampled.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-06-08 18:29:36 +02:00
Lucas Stach
f25390afa4 etnaviv: don't flush resource to self without TS
A resolve to self is only necessary if the resource is fast cleared, so
there is never a need to do so if there is no TS allocated.

Signed-off-by: Lucas Stach <dev@lynxeye.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-08 18:29:36 +02:00
Lucas Stach
0f888ad4be etnaviv: upgrade DISCARD_RANGE to DISCARD_WHOLE_RESOURCE if possible
Stolen from VC4. As we don't do any fancy reallocation tricks yet, it's
possible to upgrade also coherent mappings and shared resources.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-08 18:29:36 +02:00
Lucas Stach
d4e6de9e38 etnaviv: simplify transfer tiling handling
There is no need to special case compressed resources, as they are already
marked as linear on allocation. With that out of the way, there is room to
cut down on the number of if clauses used.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-08 18:29:36 +02:00
Lucas Stach
6e628ee3f3 etnaviv: don't read back resource if transfer discards contents
Reduces bandwidth usage of transfers which discard the buffer contents,
as well as skipping unnecessary command stream flushes and CPU/GPU
synchronization.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-08 18:29:36 +02:00
Lucas Stach
c3b2c7a75f etnaviv: honor PIPE_TRANSFER_UNSYNCHRONIZED flag
This gets rid of quite a bit of CPU/GPU sync on frequent vertex buffer
uploads and I haven't seen any of the issues mentioned in the comment,
so this one seems stale.

Ignore the flag if there exists a temporary resource, as those ones are
never busy.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-08 18:29:36 +02:00
Lucas Stach
a276c32a08 etnaviv: slim down resource waiting
cpu_prep() already does all the required waiting, so the only thing that
needs to be done is flushing the commandstream, if a GPU write is pending.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-08 18:29:36 +02:00
Rob Herring
ada3c3aa3d glsl: Fix gl_shader_stage enum unsigned comparison
Replace -1 with MESA_SHADER_NONE enum value to fix sign related warning:

external/mesa3d/src/compiler/glsl/link_varyings.cpp:1415:25: warning: comparison of constant -1 with expression of type 'gl_shader_stage' is always true [-Wtautological-constant-out-of-range-compare]
        (consumer_stage != -1 && consumer_stage != MESA_SHADER_FRAGMENT))) {
         ~~~~~~~~~~~~~~ ^  ~~

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-06-08 07:26:04 -05:00
Rob Herring
6150ea794b Android: vulkan: fix build error due to extra )
Commit 621b3410f5 ("util/vulkan: Move Vulkan utilities to
src/vulkan/util") broke the Android build with the following error:

build/core/binary.mk:1427: error: external/mesa3d/src/vulkan/Android.mk: libmesa_vulkan_util: Unused source files: util/vk_util.h).

Fixes: 621b3410f5 ("util/vulkan: Move Vulkan utilities to src/vulkan/util")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: Alex Smith <asmith@feralinteractive.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-06-08 07:26:04 -05:00
Iago Toral Quiroga
ce53e8e61b Fix glcpp test expectations
With commit f7741985be we have changed some preprocessor
error messages and warnings. Adapt related glcpp tests
expectations accordingly.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101336
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-08 09:46:36 +02:00
Vlad Golovkin
f4df2a196e util: make set's deleted_key_value declaration consistent with hash table one
This also silences following clang warnings:
no previous extern declaration for non-static variable 'deleted_key' [-Werror,-Wmissing-variable-declarations]
const void *deleted_key = &deleted_key_value;
            ^
no previous extern declaration for non-static variable 'deleted_key_value'
      [-Werror,-Wmissing-variable-declarations]
uint32_t deleted_key_value;
         ^

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-08 09:26:44 +02:00
Jason Ekstrand
f1ba51b940 i965: Delete intel_resolve_map
Now that we've moved over to the new array mechanism, it's no longer
needed.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
641405f797 i965: Use the new tracking mechanism for HiZ
This is similar to the previous commit, only for HiZ.  For HiZ, apart
from everything looking different, there is really only one functional
change:  We now track the ISL_AUX_STATE_COMPRESSED_NO_CLEAR state.
Previously, if you rendered to a resolved slice of the miptree and then
did a fast-clear with a different clear color, that slice would get
resolved even though it hadn't been fast-cleared.  Now that we can track
COMPRESSED_NO_CLEAR, we know that it doesn't have any blocks in the
"clear" state so we can skip the resolve.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
e6c69264ed i965/miptree: Make level_has_hiz take a const miptree
Acked-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
b2c6290b01 i965: Wholesale replace the color resolve tracking code
This commit reworks the resolve tracking for CCS and MCS to use the new
isl_aux_state enum.  This should provide much more accurate and easy to
reason about tracking.  In order to understand, for instance, the
intel_miptree_prepare_ccs_access function, one only has to go look at
the giant comment for the isl_aux_state enum and follow the arrows.
Unfortunately, there's no good way to split this up without making a
real mess so there are a bunch of changes in here:

 1) We now do partial resolves.  I really have no idea how this ever
    worked before.  So far as I can tell, the only time the old code
    ever did a partial resolve was when it was using CCS_D where a
    partial resolve and a full resolve are the same thing.

 2) We are now tracking 4 states instead of 3 for CCS_E.  In particular,
    we distinguish between compressed with clear and compressed without
    clear.  The end result is that you will never get two partial
    resolves in a row.

 3) The texture view rules are now more correct.  Previously, we would
    only bail if compression was not supported by the destination
    format.  However, this is not actually correct.  Not all format
    pairs are supported for texture views with CCS even if both support
    CCS individually.  Fortunately, ISL has a helper for this.

 4) We are no longer using intel_resolve_map for tracking aux state but
    are instead using a simple array of enum isl_aux_state indexed by
    level and layer.  This is because, now that we're tracking 4
    different states, it's no longer clear which should be the "default"
    and array lookups are faster than linked list searches.

 5) The new code is very assert-happy.  Incorrect transitions will now
    get caught by assertions rather than by rendering corruption.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
46fd924899 i965: Delete most of the old resolve interface
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
f296c22989 i965: Use the new get/set_aux_state functions for color clears
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
38563e95d5 i965: Move blorp to the new resolve functions
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
554f7d6d02 i965: Move depth to the new resolve functions
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
170e4b366a i965: Move images to the new resolve functions
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
8cb3b4a586 i965: Move framebuffer fetch to the new resolve functions
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
79df134d56 i965: Remove an unneeded render_cache_set_check_flush
This is only needed to fix rendering corruptions caused by not flushing
after doing a resolve operation.  The resolve now does all the needed
flushing so this is unnecessary.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
49e4d8cce2 i965: Move color rendering to the new resolve functions
This also removes an unneeded brw_render_cache_set_check_flush() call.
We were calling it in the case where the surface got resolved to satisfy
the flushing requirements around resolves.  However, blorp now does this
itself, so the extra call is just redundant.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
c0f5225264 i965: Move texturing to the new resolve functions
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
421d713eec i965: Use the new resolve function for several simple cases
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
5ec344e420 i965/miptree: Add new entrypoints for resolve management
This commit adds a new unified interface for doing resolves.  The basic
format is that, prior to any surface access such as texturing or
rendering, you call intel_miptree_prepare_access.  If the surface was
written, you call intel_miptree_finish_write.  These two functions take
parameters which tell them whether or not auxiliary compression and fast
clears are supported on the surface.  Later commits will add wrappers
around these two functions for texturing, rendering, etc.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
a59c7f834c intel/isl: Add an enum for describing auxiliary compression state
This enum describes all of the states that an auxiliary compressed
surface can have.  All of the states as well as normative language for
referring to each of the compression operations is provided in the
truly colossal comment for the new isl_aux_state enum.  There is also
a diagram showing how surfaces move between the different states.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
c89b795db4 i965: Combine render target resolve code
We have two different bits of resolve code for render targets: one in
brw_draw where it's always been and one in brw_context to deal with sRGB
on gen9.  Let's pull them together.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
0607ca42da i965: Be a bit more conservative about certain resolves
There are several places where we were resolving the entire miptree
when we really only needed to resolve a single slice.  Let's avoid the
unneeded resolving.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
3b65f9499c i965/blorp: Move MCS allocation earlier for clears
This way it happens before we call get_aux_state.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
076defba7a i965/blorp: Refactor do_single_blorp_clear
Previously, we had two checks for can_fast_clear and a tiny bit of
shared code in between.  This commit pulls all of the fast clear code
together and duplicates the tiny bit that declares some surface structs
and calls blorp_surf_for_miptree.  The duplication is no real loss and
we're about to change the two in slightly different ways.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
7a9c37eb7b i965/blorp: Take an explicit fast clear op in resolve_color
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
4afe282a35 i965/miptree: Move color resolve on map to intel_miptree_map
None of the other methods such as blit work with CCS either so we need
to do the resolve for all maps.  This change also makes us only resolve
the one slice we're mapping and not the entire image.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
ad7fa063ae i965: Inline renderbuffer_att_set_needs_depth_resolve
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
c15b2f53f4 i965: Get rid of intel_renderbuffer_resolve_*
There is exactly one caller so it's a bit pointless to have all of this
plumbing.  Just inline it at the one place it's used.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
25d00e72e4 i965/miptree: Refactor intel_miptree_resolve_color
The new version now takes a range of levels as well as a range of
layers.  It should also be a tiny bit faster because it only walks the
resolve_map list once instead of once per layer.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
64b829244b i965/miptree: Clean up the depth resolve helpers a little
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
97f6f411db i965/surface_state: Images can't handle CCS at all
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand
5097fcbfdc i965: Mark depth surfaces as needing a HiZ resolve after blitting
Cc: "17.0 17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Dave Airlie
cb2a13e895 st_glsl_to_tgsi: cleanup variable storage search.
I forgot to put the cleanup in earlier.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-08 13:29:29 +10:00
Rob Herring
f4b5510872 mesa/main: fix gl_buffer_index enum comparison
With clang, enums are unsigned by default, which gives the following warning:

external/mesa3d/src/mesa/main/buffers.c:764:21: warning: comparison of constant -1 with expression of type 'gl_buffer_index' is always false [-Wtautological-constant-out-of-range-compare]
      if (srcBuffer == -1) {
          ~~~~~~~~~ ^  ~~

Replace -1 with an enum value to fix this.
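
A minimal sketch of the pattern, assuming a stand-in enum: buffer_index,
BUF_NONE, lookup_buffer() and the token values are all hypothetical and are
not the Mesa definitions. The point is that the "not found" case becomes a
named enumerator, so the comparison no longer depends on how the compiler
chooses the enum's underlying type.

```c
#include <stdbool.h>

/* Hypothetical stand-in for gl_buffer_index; names are illustrative only. */
enum buffer_index {
   BUF_FRONT_LEFT,
   BUF_BACK_LEFT,
   BUF_COUNT,
   BUF_NONE   /* explicit "no buffer" enumerator instead of a bare -1 */
};

/* Maps an arbitrary token to a buffer index, or BUF_NONE if unknown. */
static enum buffer_index
lookup_buffer(unsigned token)
{
   switch (token) {
   case 1:  return BUF_FRONT_LEFT;
   case 2:  return BUF_BACK_LEFT;
   default: return BUF_NONE;
   }
}

/* Compares against the enumerator, so there is no out-of-range constant. */
static bool
buffer_is_valid(enum buffer_index idx)
{
   return idx != BUF_NONE;
}
```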

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-06-07 20:44:26 -05:00
Rob Herring
18348a383d glsl: fix bounds check in blob_overwrite_bytes
clang gives a warning in blob_overwrite_bytes because offset type is
size_t which is unsigned:

src/compiler/glsl/blob.c:110:15: warning: comparison of unsigned expression < 0 is always false [-Wtautological-compare]
   if (offset < 0 || blob->size - offset < to_write)
       ~~~~~~ ^ ~

Remove the less than 0 check to fix this.

Additionally, if offset is greater than blob->size, the 2nd check would
be false due to unsigned math. Rewrite the check to avoid subtraction.
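
One way to phrase such a check without a wrapping subtraction is sketched
below. range_in_bounds() is a hypothetical helper, not the code from this
commit; it also guards the addition against wrap-around, which is an extra
assumption on top of what the message describes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when [offset, offset + to_write) fits in `size` bytes.
 * With the old form, size - offset wraps to a huge value when
 * offset > size, so the "< to_write" test never rejects the write. */
static bool
range_in_bounds(size_t size, size_t offset, size_t to_write)
{
   size_t end = offset + to_write;

   /* Unsigned addition wrapped around: the range cannot be valid. */
   if (end < offset)
      return false;

   return end <= size;
}
```

For example, with size = 16 and offset = 20 the old expression computes
16 - 20, which is a very large size_t, and the write is wrongly accepted; the
additive form rejects it.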

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-06-07 20:44:26 -05:00
Dave Airlie
4453fbb024 st_glsl_to_tgsi: replace variables tracking list with a hash table
This removes the linear search, which scales badly once the number of
variables goes up to 30000 or so.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-08 07:57:50 +10:00
Dave Airlie
3008161d28 st_glsl_to_tgsi: rewrite rename registers to use array fully.
Instead of having to search the whole array, just use the whole
thing and store a valid bit in there with the rename.
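
A minimal sketch of the idea, assuming a table with one slot per old register
index: struct rename_entry and apply_renames() are hypothetical names, not
the ones used in st_glsl_to_tgsi. Each lookup is a direct index plus a
valid-bit test rather than a scan over a list of renames.

```c
#include <stdbool.h>
#include <stddef.h>

/* One slot per old register index; `valid` marks slots that hold a rename. */
struct rename_entry {
   int new_reg;
   bool valid;
};

/* Rewrites each register in `regs` in place using the direct-indexed table.
 * Registers without a valid entry are left untouched. */
static void
apply_renames(int *regs, size_t num_regs,
              const struct rename_entry *table, size_t table_size)
{
   for (size_t i = 0; i < num_regs; i++) {
      int old = regs[i];
      if (old >= 0 && (size_t)old < table_size && table[old].valid)
         regs[i] = table[old].new_reg;
   }
}
```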

Removes this from the profile on some of the fp64 tests

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-08 07:56:33 +10:00
Dave Airlie
3bc7169793 st_glsl_to_tgsi: bump index back up to 32-bit
with some of the fp64 emulation, we are seeing shaders coming in with
> 32K temps, they go out with 40 or so used, but while doing register
renumber we need to store a lot of them.

So bump these fields back up to 32-bit.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-08 07:21:06 +10:00
Marek Olšák
e93a141f64 util/u_queue: fix a use-before-initialization race for queue->threads
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-07 23:19:30 +02:00
Grazvydas Ignotas
19f6cc3cba ac/nir: remove another unused variable
Declared by each loop already.
Trivial.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
2017-06-08 00:02:42 +03:00
Grazvydas Ignotas
5bbbe91799 radv/meta: remove an unused variable
Trivial.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-06-08 00:02:36 +03:00
Grazvydas Ignotas
7dfa54399c ac/nir: convert several ifs to a switch
Also solve "outinfo may be used uninitialized" warning by putting in an
unreachable().

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-06-08 00:02:26 +03:00
Grazvydas Ignotas
ae3262c1f2 ac/nir: mark some arguments const
Most functions are only inspecting nir, so nir related arguments can be
marked const. Some more can be done if/when some nir changes are
accepted.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-06-08 00:02:02 +03:00
Samuel Li
c705caaff9 radeonsi: Use libdrm to get chipset name
v2: Add a func pointer to radeon_winsys to support radeon later.

Change-Id: I614ea71424f9e5c97e4ae68654315d28c89eaa5f
Signed-off-by: Samuel Li <Samuel.Li@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-06-07 21:53:36 +02:00
Thomas Helland
4ba4f0e976 util: Add extern c to u_dynarray.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-07 21:07:24 +02:00
Thomas Helland
cfb696dc82 nir: Delete nir_array.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-07 21:07:24 +02:00
Thomas Helland
e558a7a988 nir: Port to u_dynarray
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-07 21:07:24 +02:00
Thomas Helland
bc3a2be6c9 nir: Remove unused include
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-07 21:07:24 +02:00
Thomas Helland
9cb42ae997 util: Port nir_array functionality to u_dynarray
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-07 21:07:24 +02:00
Thomas Helland
07653f159f util: Remove unused includes and convert to lower-case memory ops
Also, prepare for the next commit by correcting some coding style
changes. This should be all non-functional changes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-07 21:07:24 +02:00
Thomas Helland
f0372814a9 util: Move u_dynarray to src/util
This will be used as the basis for unification

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-07 21:07:24 +02:00
Thomas Helland
a66befc3c8 gallium: Add missing includes
These will need to be in place to avoid regressions when
removing these includes from the u_dynarray

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-07 21:07:24 +02:00
Marek Olšák
bacaceb78a radeonsi: update clip_regs on shader state changes only when it's needed
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 20:17:20 +02:00
Marek Olšák
2b7fd9df9a radeonsi: precompute some fields for PA_CL_VS_OUT_CNTL in si_shader_selector
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 20:17:18 +02:00
Marek Olšák
140b3c5019 radeonsi: add a new helper si_get_vs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 20:17:16 +02:00
Samuel Pitoiset
878bd981bf radeonsi: isolate real framebuffer changes from the decompression passes (v3)
When a stencil buffer is part of the framebuffer state, it is
decompressed but because it's bindless, all draw calls set
stencil_dirty_level_mask to 1.

v2: Marek - set the flags outside the loop
          - also clear and set framebuffer.do_update_surf_dirtiness there
          - do it in the DB->CB copy path too
v3: Marek - save and restore the do_update_surf_dirtiness flag

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 20:17:14 +02:00
Marek Olšák
257b538fd2 radeonsi: do EarlyCSEMemSSA LLVM pass
so that LLVM IR looks like CSE has been run on it. It's also recommended
by the instruction combining pass.

This also fixes:
- GL45-CTS.arrays_of_arrays_gl.InteractionFunctionCalls2 (crash)
- piglit/spec/arb_shader_ballot/execution/fs-readFirstInvocation-uint-loop (fail)

The code size decrease is positive, the register usage isn't. There is
a decrease in VGPR spilling for Tomb Raider, but increase in DiRT Showdown
and GRID Autosport.

EarlyCSEMemSSA has a -0.01% change in code size compared to EarlyCSE.

SGPRS: 1935420 -> 1938076 (0.14 %)
VGPRS: 1645504 -> 1645988 (0.03 %)
Spilled SGPRs: 2493 -> 2651 (6.34 %)
Spilled VGPRs: 107 -> 115 (7.48 %)
Private memory VGPRs: 1332 -> 1332 (0.00 %)
Scratch size: 1512 -> 1516 (0.26 %) dwords per thread
Code Size: 61981592 -> 61890012 (-0.15 %) bytes
Max Waves: 371847 -> 371798 (-0.01 %)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 20:17:09 +02:00
Marek Olšák
e9409c86e7 radeonsi: remove 8 bytes from si_shader_key
We can use a union in si_shader_key::mono.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 20:17:06 +02:00
Marek Olšák
2b8b9a56ef radeonsi: move PSIZE and CLIPDIST unique IO indices after GENERIC
Heaven LDS usage for LS+HS is below. The masks are "outputs_written"
for LS and HS. Note that 32K is the maximum size.

Before:
  heaven_x64: ls=1f1 tcs=1f1, lds=32K
  heaven_x64: ls=31 tcs=31, lds=24K
  heaven_x64: ls=71 tcs=71, lds=28K

After:
  heaven_x64: ls=3f tcs=3f, lds=24K
  heaven_x64: ls=7 tcs=7, lds=13K
  heaven_x64: ls=f tcs=f, lds=17K

All other apps have a similar decrease in LDS usage, because
the "outputs_written" masks are similar. Also, most apps don't write
POSITION in these shader stages, so there is room for improvement.
(tight per-component input/output packing might help even more)

It's unknown whether this improves performance.

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 20:14:15 +02:00
Thomas Hellstrom
2c4ec3f93f svga: Always set the alpha value to 1 when sampling using an XRGB view
If the XRGB view is sampling from an ARGB svga format, change
PIPE_SWIZZLE_W to PIPE_SWIZZLE_1 for all channels.
Previously we unconditionally set PIPE_SWIZZLE_1 on the alpha channel which
could be both insufficient and incorrect.
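
The remapping described above can be sketched as follows. The enum is a
hypothetical stand-in for the PIPE_SWIZZLE_* values and force_alpha_one() is
an illustrative helper, not the svga driver's code. It shows why patching
only the alpha channel is not enough: any channel that selects W has to read
1 instead.

```c
/* Hypothetical stand-in for the PIPE_SWIZZLE_* selectors. */
enum swz { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W, SWZ_0, SWZ_1 };

/* For an XRGB view of ARGB data, every channel that would read the
 * underlying alpha (W) is redirected to the constant 1. */
static void
force_alpha_one(enum swz swizzle[4])
{
   for (int i = 0; i < 4; i++) {
      if (swizzle[i] == SWZ_W)
         swizzle[i] = SWZ_1;
   }
}
```

With a view swizzle of (W, Y, Z, X), setting only the fourth channel to 1
would still leak the stored alpha into the first channel and would discard
the requested X in the fourth.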

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-06-07 19:43:54 +02:00
Thomas Hellstrom
df4d6003dc svga: Fix imported surface view creation
When deciding to create a view with or without an alpha channel we need to
look at the SVGA3D format and not the PIPE format.

This fixes the glx-tfp piglit test for dri3/xa.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-06-07 19:43:54 +02:00
Thomas Hellstrom
c2138a066c svga: Set alpha to 1 for non-alpha views
Gallium RGB textures may be backed by imported ARGB svga3d surfaces. In those
and similar cases we need to set the alpha value to 1 when sampling.

Fixes piglit glx::glx-tfp

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-06-07 19:43:54 +02:00
Thomas Hellstrom
1887faf73b svga: Allow format differences in 16-bit RGBA surface sharing
For the purpose of surface sharing, treat SVGA3D_R5G6B5 and
SVGA3D_B5G6R5_UNORM as identical formats.
This fixes the following piglit tests with dri3/xa:

glx@glx-visuals-depth -pixmap
glx@glx-visuals-stencil -pixmap

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Singh Rawat <drawat@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-06-07 19:43:54 +02:00
Thomas Hellstrom
b8b0a3dc5c dri/vmwgfx: Disable a couple of glx extensions also for Ubuntu unity / compiz
It appears like the GLX_EXT_buffer_age extension also prevents Compiz /
Ubuntu Unity from performing partial buffer swaps when it otherwise
feels like doing so. So try to get them back again. We also disable
GLX_OML_sync_control since it appears it had a favourable impact on
gnome-shell.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2017-06-07 19:43:54 +02:00
Thomas Hellstrom
37e8341db4 dri: Turn off a couple of glx extensions for gnome-shell on vmwgfx.
Increases performance on vmwgfx since we're avoiding full buffer damage and
since we can't sync to vertical retrace anyway.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-07 19:43:54 +02:00
Thomas Hellstrom
48f4baf63f st/dri: Allow gallium drivers to turn off two GLX extensions
Allow gallium drivers to turn off GLX_EXT_buffer_age and
GLX_OML_sync_control if needed, using driconf.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-07 19:43:54 +02:00
Thomas Hellstrom
9d3f177e4b dri: Optionally turn off a couple of GLX extensions based on driconf options
With GLX_EXT_buffer_age turned on, gnome-shell will use full-screen damage
with GLX, which severely hurts performance on architectures that emulate
page-flips with copies, such as vmware. We would like to be able to turn off that
extension. Similarly, typically the GLX_OML_sync_control doesn't make much
sense on a virtual architecture since we don't really sync to the host's
vertical retrace. We'd like to be able to turn it off as well.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-07 19:43:54 +02:00
Thomas Hellstrom
ff2978b449 st/dri: Allow dri users to query also driver options
There will be situations where we want to control, for example, the
GLX behaviour based on applications and drivers. So allow DRI users access
to the driver options.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-07 19:43:54 +02:00
Marek Olšák
7d67cbefe0 radeonsi: clean up decompress blend state names
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 19:38:45 +02:00
Marek Olšák
882c18bf1c gallium/radeon: clean up a misleading statement from the old days
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 19:38:43 +02:00
Marek Olšák
66176e6f14 radeonsi: don't use 1D tiling for Z/S on VI to get TC-compatible HTILE
It's always good to have fewer decompress blits.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 19:38:42 +02:00
Marek Olšák
d2ee423b69 radeonsi: enable TC-compatible stencil compression on VI
Most things are in place. Ideally we won't see decompress blits for stencil
anymore.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 19:38:39 +02:00
Marek Olšák
e003e3c4c0 st/mesa: don't keep framebuffer state in st_context
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:46:21 +02:00
Marek Olšák
f34abf77e9 st/mesa: cache pipe_surface for GL_FRAMEBUFFER_SRGB changes
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:46:21 +02:00
Marek Olšák
f7523f1ef6 st/mesa: use gl_driver_flags::NewFramebufferSRGB
also call st_init_driver_flags when st_context is initialized.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:46:21 +02:00
Marek Olšák
ac0aff7222 mesa: add gl_driver_flags::NewFramebufferSRGB
_NEW_BUFFERS updates too much stuff.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:46:21 +02:00
Marek Olšák
3effce4fb0 radeonsi/gfx9: prevent a race when the previous shader's main part is missing
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:43:42 +02:00
Marek Olšák
b5bc826ead radeonsi/gfx9: wait for main part compilation of 1st shaders of merged shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:43:42 +02:00
Marek Olšák
ffbaba6072 radeonsi/gfx9: fix LS scratch buffer support without TCS for GFX9
LS is merged into TCS. If there is no TCS, LS is merged into fixed-func
TCS. The problem is the fixed-func TCS was ignored by scratch update
functions, so LS didn't have the scratch buffer set up.

Note that Mesa 17.1 doesn't have merged shaders.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:43:42 +02:00
Marek Olšák
6e2c07749b radeonsi: move streamout state update out of si_update_shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:43:42 +02:00
Marek Olšák
294be5279d radeonsi: remove dead code in declare_input_fs
Colors are interpolated in the PS prolog. This was never used.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:43:42 +02:00
Marek Olšák
8147c4a4a5 radeonsi: move handling of DBG_NO_OPT_VARIANT into si_shader_selector_key
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:43:42 +02:00
Marek Olšák
86cc809726 radeonsi: use a compiler queue with a low priority for optimized shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:43:42 +02:00
Marek Olšák
89b6c93ae3 util/u_queue: add an option to set the minimum thread priority
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:43:42 +02:00
Marek Olšák
6f2947fa79 radeonsi: decrease the number of compiler threads to num CPUs - 1
Reserve one core for other things (like draw calls).

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:43:42 +02:00
Marek Olšák
38bd468a78 radeonsi: drop unfinished shader compilations when destroying shaders
If we enqueue too many jobs and destroy the GL context, it may take
several seconds before the jobs finish. Just drop them instead.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:43:42 +02:00
Marek Olšák
33e507ec23 util/u_queue: add a way to remove a job when we just want to destroy it
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:43:42 +02:00
Rob Clark
812fd1aaa8 freedreno/a5xx: set SP_BLEND_CONTROL properly
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-07 12:32:00 -04:00
Rob Clark
5b60004525 freedreno/a5xx: LRZ support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-07 12:32:00 -04:00
Rob Clark
313f6360aa freedreno: drop timestamp field
unused.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-07 12:32:00 -04:00
Rob Clark
5589ba983d freedreno/a5xx: refactor out helper for LRZ flush
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-07 12:32:00 -04:00
Rob Clark
e26a7c1cf2 freedreno: reshuffle FD_MESA_DEBUG bitmask
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-07 12:32:00 -04:00
Rob Clark
613410c8fc freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-07 12:32:00 -04:00
Marek Olšák
a893c91697 gallium/u_blitter: use 2D_ARRAY for cubemap blits if possible
so that we can use TXF.

The cubemap blit pixel shader code size: 148 -> 92 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:10:50 +02:00
Marek Olšák
4a88c7774c gallium/u_blitter: use TXF if possible
This fixes piglit:
    arb_texture_view-rendering-r32ui

TEX (image_sample) flushes denorms to 0 with FP32 textures on GCN, but such
a texture can contain integer data written using an integer render view.
If we do a transfer blit with TEX, denorms are flushed to 0. Luckily,
TXF (image_load) doesn't do that.

TXF also doesn't need to load the sampler state, so blit shaders don't have
to do s_load_dwordx4.

TXF doesn't do CLAMP_TO_EDGE, so it can only be used if the src box is
in bounds, or if we clamp manually (this commit doesn't).

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:10:50 +02:00
Marek Olšák
0604568527 gallium/u_blitter: use TEX_LZ if it's supported
The sampler views always have first_level == last_level.
Now radeonsi doesn't have to use the WQM. (a few SALU removed)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:10:50 +02:00
Marek Olšák
eedca3323e gallium/util: add _LZ and TXF options to simple shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:10:50 +02:00
Marek Olšák
20c2785f7c gallium/ureg: add TEX/TXF_LZ opcodes to ureg
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-07 18:10:50 +02:00
Jason Ekstrand
dd294fd2d9 i965: Use BLORP for all HiZ ops
BLORP has been capable of doing gen8-style HiZ ops for a while now.  We
might as well start using it.  The one downside is that this may cause a
bit more state emission since we still re-emit most things for BLORP.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 08:54:54 -07:00
Jason Ekstrand
bacae7221b blorp: Use FullSurfaceDepthandStencilClear for blorp_hiz_op
The blorp_hiz_op entrypoint always acts on a full subresource of a HiZ
buffer so we can just set the flag unconditionally.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 08:54:54 -07:00
Jason Ekstrand
a2152775fd i965: Move the post-HiZ-clear flush/stall to intel_hiz_exec
This also changes it to be predicated so we only do the flush/stall on
clears and HiZ resolves.  The docs only say it's needed for clears but
empirical evidence says it's also needed for HiZ resolves.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 08:54:54 -07:00
Jason Ekstrand
9cb6ac62fb intel/blorp: Plumb through access to the workaround BO
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101283
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 08:54:54 -07:00
Nanley Chery
ed5801864e anv/blorp: Move the depth cache flush outside of BLORP
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-07 08:54:54 -07:00
Jason Ekstrand
fbd8a33f61 intel/blorp: Refactor the HiZ op interface
This commit does a few things:

 1) Now that BLORP can do HiZ ops on gen8+, drop the gen6 prefix.
 2) Switch parameters to uint32_t to match the rest of blorp.
 3) Take a range of layers and loop internally.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 08:54:54 -07:00
Jason Ekstrand
42b10bbfe0 i965/blorp: Inline gen6_blorp_exec
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 08:54:54 -07:00
Jason Ekstrand
acbd02450b i965: Perform HiZ flush/stall prior to HiZ resolves
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 08:54:54 -07:00
Jason Ekstrand
acb9a2ef8f i965: Move the pre-depth-clear flush/stalls to intel_hiz_exec
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 08:54:54 -07:00
Jason Ekstrand
252b004a51 i965/blorp: Take a layer range in intel_hiz_exec
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-07 08:54:54 -07:00
Jason Ekstrand
f9fd976e8a i965/miptree: Store fast clear colors in an isl_color_value
This commit, out of necessity, makes a number of changes at once:

 1) Changes intel_mipmap_tree to store the clear color for both color
    and depth as an isl_color_value.

 2) Changes the depth/stencil emit code to do the format conversion of
    the depth clear value on Haswell and earlier instead of pulling a
    uint32_t directly from the miptree.

 3) Changes ISL's depth/stencil emit code to perform the format
    conversion of the depth clear value on Haswell and earlier instead
    of assuming that the depth value in the float is pre-converted.

 4) Changes blorp to pass the depth value through as a float.

 5) Changes the Vulkan driver to pass the depth value to blorp as a
    float rather than a uint.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 08:54:54 -07:00
Thomas Hellstrom
1253d58983 dri3/GLX: Fix drawable invalidation v2
Image comparisons for a number of internal VMware apitrace traces fail
with dri3 because the viewport transformation becomes incorrect after an X
drawable resize. The incorrect viewport transformation sometimes persists
until the second draw call after a swapBuffer.

Comparing with the dri2 glx code there are a couple of places where dri2
invalidates the drawable in the absence of server-triggered invalidation,
where dri3 doesn't do that. When these invalidation points are added to
dri3, the image comparisons become correct.

v2:
Addressed review comment by Michel Dänzer.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-and-tested-by: Michel Dänzer <michel.daenzer@amd.com>
2017-06-07 11:23:56 +02:00
Kenneth Graunke
09c3a00f10 i965: Fix alpha to one with dual color blending.
The BLEND_STATE documentation says that alpha to one must be disabled
when dual color blending is enabled.  However, it appears that it simply
fails to override src1 alpha to one.

We can work around this by leaving alpha to one enabled, but overriding
SRC1_ALPHA to ONE and ONE_MINUS_SRC1_ALPHA to ZERO.  This appears to be
what the other driver does, and it looks like it works despite the
documentation saying not to do it.
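
A minimal sketch of the factor override described above. The enum and
function names are hypothetical stand-ins, not the i965 identifiers; only
the SRC1_ALPHA -> ONE and ONE_MINUS_SRC1_ALPHA -> ZERO mapping comes from
this commit message.

```c
#include <stdbool.h>

/* Hypothetical blend factor enum, for illustration only. */
enum blend_factor_sketch {
   BF_ZERO,
   BF_ONE,
   BF_SRC_ALPHA,
   BF_SRC1_ALPHA,
   BF_ONE_MINUS_SRC1_ALPHA,
};

/* With alpha-to-one enabled, src1 alpha is treated as 1.0, so the two
 * factors that read it are replaced by their constant equivalents. */
static enum blend_factor_sketch
fix_dual_src_factor(enum blend_factor_sketch factor, bool alpha_to_one)
{
   if (!alpha_to_one)
      return factor;

   switch (factor) {
   case BF_SRC1_ALPHA:
      return BF_ONE;
   case BF_ONE_MINUS_SRC1_ALPHA:
      return BF_ZERO;
   default:
      return factor;
   }
}
```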

Fixes spec/ext_framebuffer_multisample/alpha-to-one-dual-src-blend *
Piglit tests.

v2: Add UNUSED to shut up warning on generations which don't use this.

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-06-07 02:13:49 -07:00
Samuel Pitoiset
98d5667f4b mesa: add KHR_no_error support for glTexSubImage*D()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:04:40 +02:00
Samuel Pitoiset
7b104d9c50 mesa: add texsubimage() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:04:36 +02:00
Samuel Pitoiset
3e34fc0363 mesa: make _mesa_texture_sub_image() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:04:35 +02:00
Samuel Pitoiset
c2b6a63130 mesa: rename texsubimage() to texsubimage_err()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:04:34 +02:00
Samuel Pitoiset
287a7a0ca6 mesa: add KHR_no_error support for glCopyImageSubData()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:04:33 +02:00
Samuel Pitoiset
41df4b1d7e mesa: add copy_image_subdata() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:04:28 +02:00
Samuel Pitoiset
4485c28e1f mesa: add prepare_target() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:04:27 +02:00
Samuel Pitoiset
185a79a549 mesa: rename prepare_target() to prepare_target_err()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:04:25 +02:00
Samuel Pitoiset
6fedb31785 mesa: add KHR_no_error support for glBlitNamedFramebuffer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:04:21 +02:00
Samuel Pitoiset
25304a44da mesa: add blit_named_framebuffer() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:04:18 +02:00
Samuel Pitoiset
a9600318ee mesa: add KHR_no_error support for glBlitFramebuffer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:04:16 +02:00
Samuel Pitoiset
d496b879ed mesa: add validate_depth_buffer() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:04:14 +02:00
Samuel Pitoiset
f88c367ba9 mesa: add validate_stencil_buffer() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:04:11 +02:00
Samuel Pitoiset
bf0bf23f94 mesa: add validate_color_buffer() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:04:08 +02:00
Samuel Pitoiset
4f805edd3f mesa: wrap blit_framebuffer() into blit_framebuffer_err()
Also add ALWAYS_INLINE to blit_framebuffer().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:04:01 +02:00
Samuel Pitoiset
cb1d5f4639 mesa: add 'no_error' parameter to blit_framebuffer()
The whole GLES3 block has been moved before the buffer validation
checks.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:03:57 +02:00
Samuel Pitoiset
63a60584d1 mesa: make _mesa_blit_framebuffer() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:03:56 +02:00
Samuel Pitoiset
c231590f8d mesa: add KHR_no_error support for glBindBuffer()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:03:54 +02:00
Samuel Pitoiset
b019c4e6e8 mesa: add KHR_no_error support for glInvalidateBufferData()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:03:53 +02:00
Samuel Pitoiset
9ab285e588 mesa: add KHR_no_error support for glInvalidateBufferSubData()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:03:52 +02:00
Samuel Pitoiset
ec0c2eb845 mesa: add invalidate_buffer_subdata() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:03:47 +02:00
Samuel Pitoiset
2933ed56ce mesa: add KHR_no_error support for glBindVertexBuffers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:03:47 +02:00
Samuel Pitoiset
5da83140df mesa: add KHR_no_error support for glVertexArrayVertexBuffers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:03:45 +02:00
Samuel Pitoiset
a11c7e3fb5 mesa: add vertex_array_vertex_buffers_err() helper
This also adds a 'no_error' parameter to vertex_array_vertex_buffer()
to be used in a following patch.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 11:03:40 +02:00
Samuel Pitoiset
f075c2bc0b mesa: add KHR_no_error support for glScissor*()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 09:09:19 +02:00
Samuel Pitoiset
e2524a21cb mesa: add scissor() and scissor_array() helpers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 09:09:17 +02:00
Samuel Pitoiset
80ae5c128d mesa: rename ScissorIndexed() to scissor_indexed_err()
And move GET_CURRENT_CONTEXT() into the APIENTRY calls
for consistency.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 09:09:13 +02:00
Samuel Pitoiset
e8de0e124f mesa: use _mesa_set_scissor() in ScissorIndexed()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 09:09:11 +02:00
Samuel Pitoiset
ee38dfe9a5 mesa: make _mesa_scissor_bounding_box() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 09:09:04 +02:00
Samuel Pitoiset
8614f31be2 mesa: inline update_image_transfer_state() into _mesa_update_pixel()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 09:06:05 +02:00
Samuel Pitoiset
51854def8a mesa: remove useless check in _mesa_update_pixel()
The only caller is _mesa_update_state_locked() which already
checks if _NEW_PIXEL is set before calling _mesa_update_pixel().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-07 09:05:25 +02:00
Iago Toral Quiroga
f7741985be glcpp: fix #undef to match latest spec update and GLSLang implementation
GLSL ES spec includes the following:

   "It is an error to undefine or to redefine a built-in
    (pre-defined) macro name."

But desktop GLSL doesn't. This has sparked some discussion
in Khronos, and the final conclusion was to update the
GLSL 4.50 spec to include the following:

   "By convention, all macro names containing two consecutive
    underscores ( __ ) are reserved for use by underlying
    software layers.  Defining or undefining such a name in a
    shader does not itself result in an error, but may result
    in unintended behaviors that stem from having multiple
    definitions of the same name.  All macro names prefixed
    with “GL_” (“GL” followed by a single underscore) are also
    reserved, and defining or undefining such a name results in
    a compile-time error."

In other words, undefining GL_* names should be an error, but
undefining other names with a double underscore in them is
not strictly prohibited in desktop GLSL.

This patch fixes the preprocessor to apply these rules,
following exactly the implementation already present
in GLSLang. This fixes some tests in CTS.
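
The desktop GLSL rule quoted above can be sketched roughly as follows.
The enum and function names are hypothetical and this is not the glcpp
or GLSLang code; it only encodes the two reserved-name cases from the
quoted GLSL 4.50 wording.

```c
#include <string.h>

/* Hypothetical result of checking a #define/#undef target name. */
enum macro_name_status {
   MACRO_NAME_OK,
   MACRO_NAME_RESERVED_NO_ERROR, /* contains "__": reserved, not an error */
   MACRO_NAME_RESERVED_ERROR,    /* starts with "GL_": compile-time error */
};

static enum macro_name_status
classify_macro_name(const char *name)
{
   /* "GL" followed by a single underscore is checked first, since the
    * spec makes defining or undefining such names an error. */
   if (strncmp(name, "GL_", 3) == 0)
      return MACRO_NAME_RESERVED_ERROR;

   /* Two consecutive underscores anywhere in the name. */
   if (strstr(name, "__") != NULL)
      return MACRO_NAME_RESERVED_NO_ERROR;

   return MACRO_NAME_OK;
}
```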

Khronos bug:
https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16003

Fixes:
KHR-GL45.shaders.preprocessor.definitions.undefine_core_profile_vertex
KHR-GL45.shaders.preprocessor.definitions.undefine_core_profile_fragment

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-07 07:50:46 +02:00
Dave Airlie
1ec4f008a2 ac/nir: move gpr counting inside argument handling.
This just moves this code in here so it's cleaner.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-07 06:00:30 +01:00
Dave Airlie
7b46e2a74b ac/nir: assign argument param pointers in one place.
Instead of having the fragile code to do a second pass, just
give the pointers you want params in to the initial code,
then call a later pass to assign them.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-07 06:00:23 +01:00
Dave Airlie
b19cafd441 ac/nir: consolidate setting userdata location
Just pass a pointer and increment inside the function,
makes the code less error prone.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-07 05:59:57 +01:00
Timothy Arceri
d0cec1fce1 glthread: remove extra _mesa_glthread_finish() from generated code
The other user of print_sync_dispatch() was ending up with code that
looked like:

      _mesa_glthread_finish(ctx);
      _mesa_glthread_restore_dispatch(ctx);
      _mesa_glthread_finish(ctx);

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-07 14:53:38 +10:00
Anuj Phogat
8d02916e0c intel: Fix broxton 2x6 way size computation
This patch is undoing the changes to way size computation
in broxton 2x6, made by below commit:

Commit: 0d576fbfbe
Author:     Anuj Phogat <anuj.phogat@gmail.com>
i965: Simplify l3 way size computations

By making use of l3_banks field in gen_device_info struct
l3_way_size for gen7+ = 2 * l3_banks.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101306
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-06 21:30:51 -07:00
Dave Airlie
86eff151b1 radv: move chip_class extraction down further.
This seems to matter here in a profile, without this we spend a lot
more time exiting this function with no flush bits.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-07 10:25:20 +10:00
Dave Airlie
00fe30f376 radv: move lots of index related things into the bind.
This just moves lots of stuff to the bind stage rather than
dealing with it in the draw stage.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-07 10:24:37 +10:00
Dave Airlie
734ea16bdb radv: move calculating the vertex sgpr to the pipeline.
There is no need to calculate this at draw time.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-07 10:24:36 +10:00
Dave Airlie
3f48021b86 radv: rename and make global some functions.
I want to use these in the pipeline setup stage.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-07 10:24:36 +10:00
Eric Engestrom
63a8a88ac4 tree-wide: remove trailing backslash
Simple search for a backslash followed by two newlines.
If one of the newlines were to be removed, this would cause issues, so
let's just remove these trailing backslashes.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-06-07 01:18:09 +01:00
Dave Airlie
f0b82bc545 radv/gfx9: use correct register setting for uconfig regs
Thanks to Marek for pointing this out.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-07 08:09:03 +10:00
Bas Nieuwenhuizen
59c2e2a061 radv: Remove SI num RB override for occlusion queries.
radeonsi doesn't have it anymore either.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-06 23:23:43 +02:00
Bas Nieuwenhuizen
d607b83b79 radv: Split out updating the vertex descriptors.
Simple refactor.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-06 23:23:43 +02:00
Bas Nieuwenhuizen
58c8aae241 radv: Move pipeline stuff from flush_state to emit_graphics_pipeline.
No functional changes.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-06 23:23:43 +02:00
Bas Nieuwenhuizen
e08f741678 radv: Add early exit for cache flushes.
No sense checking each bit separately in the common case of none
being set.
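
A rough sketch of the early-exit pattern, assuming a plain bitmask of
pending flushes. The flag names and the function are hypothetical and
do not match the radv source.

```c
#include <stdint.h>

/* Hypothetical flush flags, for illustration only. */
#define SKETCH_FLUSH_ICACHE  (1u << 0)
#define SKETCH_FLUSH_KCACHE  (1u << 1)
#define SKETCH_FLUSH_VCACHE  (1u << 2)

/* Returns how many flush operations were issued and clears the mask. */
static unsigned
emit_cache_flushes(uint32_t *flush_bits)
{
   unsigned emitted = 0;

   /* Common case: nothing pending, so skip every per-bit test. */
   if (*flush_bits == 0)
      return 0;

   if (*flush_bits & SKETCH_FLUSH_ICACHE)
      emitted++;
   if (*flush_bits & SKETCH_FLUSH_KCACHE)
      emitted++;
   if (*flush_bits & SKETCH_FLUSH_VCACHE)
      emitted++;

   *flush_bits = 0;
   return emitted;
}
```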

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-06 23:23:43 +02:00
Bas Nieuwenhuizen
4ec89727b2 radv: Remove vertex_descriptors_dirty.
Redundant.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-06 23:23:43 +02:00
Bas Nieuwenhuizen
fe0b8d1e8b radv: Don't use a divide by index_size.
Divides are pretty slow, and this is in the hot path of a draw.
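
A minimal sketch of one way to avoid the divide, assuming index_size is
always 2 or 4 bytes (16-bit or 32-bit indices). The helper name is
hypothetical and the actual change in radv may be structured differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many indices fit in `bytes` bytes.
 * Because index_size is a power of two, the divide becomes a shift. */
static inline uint32_t
index_count_from_bytes(uint32_t bytes, uint32_t index_size)
{
   assert(index_size == 2 || index_size == 4);

   /* log2(2) == 1, log2(4) == 2 */
   uint32_t shift = (index_size == 4) ? 2 : 1;
   return bytes >> shift;
}
```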

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-06 23:23:43 +02:00
Chris Wilson
7063696b71 i965: Explicitly disallow tiled memcpy path on Gen4 with swizzling.
The manual detiling paths are not prepared to handle Gen4-G45 with
swizzling enabled, so explicitly disable them.  (They're already
disabled because these platforms don't have LLC but a future patch could
enable this path).

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-06 11:47:56 -07:00
Matt Turner
bc17155fd0 i965: Remove brw_bo_map_unsynchronized()
Call brw_bo_map() directly.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-06 11:47:47 -07:00
Matt Turner
e0a9b261e5 i965: Use unsynchronized mappings for BufferSubData on non-LLC
Now that unsynchronized maps actually work, we can use them, like we do
on LLC platforms.

On Broxton, the performance of Unigine Valley 1.1-rc1 is improved by
37.6656% +/- 0.401389% (n=20) at 1280x720/QUALITY_LOW, and by
20.862% +/- 2.20901% (n=3) at 1920x1080/QUALITY_LOW.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-06 11:47:47 -07:00
Matt Turner
a16355d67d i965: Make unsynchronized maps unsynchronized on non-LLC
On Broxton, the performance of Unigine Valley 1.0 is improved by
13.3067% +/- 0.144322% (n=40) at 1280x720/QUALITY_LOW, and by
1.68478% +/- 0.484226% (n=3) at 1920x1080/QUALITY_LOW.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-06 11:47:47 -07:00
Matt Turner
ce17d4c5f5 i965: Implement brw_bo_map_unsynchronized() with MAP_ASYNC
This way we can let brw_bo_map() choose the best mapping type.

Part of the patch inlines map_gtt() into brw_bo_map_gtt() (and removes
map_gtt()). brw_bo_map_gtt() just wrapped map_gtt() with locking and a
call to set_domain(). map_gtt() is called by brw_bo_map_unsynchronized()
to avoid the call to set_domain(). With the MAP_ASYNC flag, we now have
the same behavior previously provided by brw_bo_map_unsynchronized().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-06 11:47:46 -07:00
Matt Turner
68bfc377fb i965: Elide call to set_domain() if MAP_ASYNC
No functional change (no callers currently pass MAP_ASYNC)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-06 11:47:46 -07:00
Matt Turner
2120cfe1af i965: Add and use brw_bo_map()
We can encapsulate the logic for choosing the mapping type. This will
also help when we add WC mappings.

A few functional changes are made in this patch. On non-LLC, what were
previously WB mappings are now GTT mappings (in the prefilling debug
code in brw_performance_query.c; the shader_time code in brw_program.c;
and in the case of an RW mapping in intel_buffer_objects.c).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-06 11:47:46 -07:00
Matt Turner
dcb03bf18d i965: Drop MAP_READ from some write-only mappings
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-06 11:47:46 -07:00
Matt Turner
275401d32b i965: Pass flags to brw_bo_map_*
brw_bo_map_cpu() took a write_enable arg, but it wasn't always clear
whether we were also planning to read from the buffer. I kept everything
semantically identical by passing only MAP_READ or MAP_READ | MAP_WRITE
depending on the write_enable argument.

The other flags are not used yet, but MAP_ASYNC for instance, will be
used in a later patch to remove the need for a separate
brw_bo_map_unsynchronized() function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-06 11:47:46 -07:00
Matt Turner
925a4222f2 i965: Rename brw_bo_map() -> brw_bo_map_cpu()
I'm going to make a new function named brw_bo_map() in a later patch
that is responsible for choosing the mapping type, so this patch clears
the way.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-06 11:47:46 -07:00
Matt Turner
3d1530d3e8 i965: Rename *_virtual -> map_*
I think these are better names, and it reduces the delta between
upstream and Chris Wilson's brw-batch branch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-06 11:47:46 -07:00
Chris Wilson
6aa2e8777b i965: Pass the map-mode along to intel_mipmap_tree_map_raw()
Since we can distinguish when mapping between READ and WRITE, we can
pass along the map mode to avoid stalls and flushes where possible.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-06-06 11:47:46 -07:00
Matt Turner
47bb498534 i965: Add a cache_coherent field to brw_bo
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-06 11:47:46 -07:00
Matt Turner
51d714dca6 i965: Remove unused 'use_resource_streamer' field
Missed in the resource streamer removal of commit 951f56cd43.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-06 11:47:46 -07:00
Matt Turner
5dc35e1664 i965: Remove brw_bo's virtual member
Just return the map from brw_bo_map_*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-06 11:47:46 -07:00
Matt Turner
d7024a6b3c i965: Remove unused brw_bo_map__* functions
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-06 11:47:46 -07:00
Alex Smith
922b038864 anv: Set better descriptor set limits
Based on discussions with Jason, Ivy Bridge and Bay Trail only actually
support 16 samplers, while newer hardware can support more than the
current limit of 64. Therefore set the lower limit where needed, and
bump up to 128 for everything else. There is also a limit on the total
number of other resources of around 250.

This allows Dawn of War III to render correctly on ANV.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-06 08:20:09 -07:00
Alex Smith
59c1797d56 anv: Set driver version to Mesa version
As already done by RADV.

v2: Move version calculation function to src/vulkan/util to share with
    RADV.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-06 08:20:00 -07:00
Alex Smith
dc6182fa3f radv/vulkan: Move radv_get_driver_version to src/vulkan/util
This means it can be reused for other Vulkan drivers. Also fix up a
typo, need to search for '.' in the version string rather than ','.
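
As an illustration of the '.' search, here is a sketch that parses a
"major.minor.patch" string and packs it with the same bit layout as
VK_MAKE_VERSION (major << 22 | minor << 12 | patch). The function name
is hypothetical and this is not the code that was moved.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static uint32_t
pack_driver_version(const char *version)
{
   /* Components are separated by '.', not ','. */
   const char *minor_str = strchr(version, '.');
   const char *patch_str = minor_str ? strchr(minor_str + 1, '.') : NULL;

   /* strtol stops at the first non-digit, so suffixes such as
    * "-devel" after the patch number are ignored. */
   uint32_t major = (uint32_t)strtol(version, NULL, 10);
   uint32_t minor = minor_str ? (uint32_t)strtol(minor_str + 1, NULL, 10) : 0;
   uint32_t patch = patch_str ? (uint32_t)strtol(patch_str + 1, NULL, 10) : 0;

   return (major << 22) | (minor << 12) | patch;
}
```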

v2: Remove unneeded temporary version variable (Emil, Eric)

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-06 08:19:55 -07:00
Alex Smith
621b3410f5 util/vulkan: Move Vulkan utilities to src/vulkan/util
We have Vulkan utilities in both src/util and src/vulkan/util. The
latter seems a more appropriate place for Vulkan-specific things, so
move them there.

v2: Android build system changes (from Tapani Pälli)

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-06 08:17:13 -07:00
Lionel Landwerlin
2ef73473c8 intel: gen-decoder: rework how we handle groups
The current way of handling groups doesn't seem to be able to handle
MI_LOAD_REGISTER_* with more than one register. This change reworks
the way we handle groups by building a traversal list on loading the
GENXML files.

Let's say you have

Instruction {
  Field0
  Field1
  Field2
  Group0 (count=2) {
    Field0-0
    Field0-1
  }
  Group1 (count=4) {
    Field1-0
    Field1-1
  }
}

We build a linked list on load that goes:

Instruction -> Group0 -> Group1

All of those are gen_group structures, making the traversal trivial.
We just need to iterate groups for the right number of times (count
field in genxml).

The more fancy case is when you have only a single group of unknown
size (count=0). In that case we keep on reading that group for as long
as we're within the DWordLength of that instruction.
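
The traversal can be sketched roughly as below. The struct and its
fields are simplified, hypothetical stand-ins for gen_group; the sketch
only counts how many group repetitions would be visited for a given
instruction length.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified group node. The instruction itself is the
 * head of the list with count == 1. */
struct group_sketch {
   uint32_t count;    /* repetitions; 0 means "until the instruction ends" */
   uint32_t size_dw;  /* dwords consumed by one repetition */
   struct group_sketch *next;
};

static uint32_t
count_group_iterations(const struct group_sketch *head, uint32_t length_dw)
{
   uint32_t dw = 0, iterations = 0;

   for (const struct group_sketch *g = head; g != NULL; g = g->next) {
      if (g->size_dw == 0)
         continue; /* avoid looping forever on an empty group */

      /* count == 0: keep reading while we stay within the instruction. */
      for (uint32_t i = 0;
           (g->count == 0 || i < g->count) && dw + g->size_dw <= length_dw;
           i++) {
         dw += g->size_dw;
         iterations++;
      }
   }
   return iterations;
}
```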

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-06 14:04:37 +01:00
Marek Olšák
6c655cfeb4 radeonsi: fix a GPU hang with tessellation on 2-CU configs
Only harvested Stoney has 2 CUs. Tested on 2-CU Stoney and Fiji forced
to 2 CUs.

Cc: 17.0 17.1 <mesa-stable@lists.freedesktop.org>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-06-06 13:01:52 +02:00
Samuel Pitoiset
b9f9bad4eb mesa: make use of NewWindowRectangles driver flags
Now, st_update_window_rectangles() won't be called when the
scissor is going to be updated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-06 11:47:31 +02:00
Samuel Pitoiset
035b0176e2 mesa: add new gl_driver_flags::NewWindowRectangles
This new driver flag will replace _NEW_SCISSOR which is
emitted when setting new window rectangles but it actually
triggers useless changes in the state tracker (like scissor
and rasterizer).

EXT_window_rectangles is currently only supported by Nouveau.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-06 11:47:27 +02:00
Samuel Pitoiset
d19d8f5e6b mesa: remove call to Driver.Scissor() in _mesa_WindowRectanglesEXT()
This is actually useless because this driver call is only used
by the classic DRI drivers which don't support that extension
and probably will never support it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-06 11:47:24 +02:00
Samuel Pitoiset
11c6aab239 mesa: only emit _NEW_MULTISAMPLE when min sample shading changes
We usually check that given parameters are different before
updating the state.
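
A minimal sketch of the check-before-update pattern, using a
hypothetical context struct and flag rather than the real gl_context
and _NEW_MULTISAMPLE definitions.

```c
#include <stdint.h>

#define SKETCH_NEW_MULTISAMPLE (1u << 0) /* stand-in for _NEW_MULTISAMPLE */

/* Hypothetical, reduced context. */
struct ctx_sketch {
   float min_sample_shading;
   uint32_t new_state;
};

static void
set_min_sample_shading(struct ctx_sketch *ctx, float value)
{
   /* Nothing changed: do not flag the state as dirty. */
   if (ctx->min_sample_shading == value)
      return;

   ctx->new_state |= SKETCH_NEW_MULTISAMPLE;
   ctx->min_sample_shading = value;
}
```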

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-06 11:47:22 +02:00
Samuel Pitoiset
af9e537be3 mesa: only emit _NEW_MULTISAMPLE when sample mask changes
We usually check that given parameters are different before
updating the state.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-06 11:47:19 +02:00
Samuel Pitoiset
706e31fe5a mesa: only emit _NEW_MULTISAMPLE when coverage parameters change
We usually check that given parameters are different before
updating the state.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-06 11:47:16 +02:00
Kenneth Graunke
9cd69022d5 i965: Change INTEL_DEBUG=vec4 to INTEL_SCALAR_VS for consistency.
We moved to INTEL_SCALAR_* when we added more than a single stage, but
never went back and converted the VS to work that way.  Be consistent.

Also update the documentation to actually mention these debug variables.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-05 23:32:40 -07:00
Dave Airlie
2890a71158 radv: expose integrated device type for APUs.
This just sets the vulkan device type depending on whether
this is an APU or GPU.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
2017-06-06 12:48:57 +10:00
Bas Nieuwenhuizen
ecdace80f4 ac/surface: Fix HTILE for radv.
We always compute HTILE size using addrlib, even when not TC compatible.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-06 03:17:02 +02:00
Dave Airlie
0e72dea46f radv: fix write event eop on vega.
Typo here, fixes command submission hangs on vega
2017-06-06 10:43:19 +10:00
Dave Airlie
65477bae9c radv: enable GFX9 on radv
I'm open to reverting this closer to release if bad things
happen, but it might be easier for debugging to leave it for now.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:44:26 +10:00
Dave Airlie
c07eb1823f radv: turn off geom/tess for gfx9.
We don't support these yet, and it'll take a bit of work to do so.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:44:18 +10:00
Dave Airlie
348f63623b radv: misc GFX9 changes.
These are just some register changes ported from radeonsi for gfx9.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:44:10 +10:00
Dave Airlie
289de9f945 radv: add some GFX9 specific events.
These are ported from radeonsi, don't know all the rules for
when they should be inserted.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:44:00 +10:00
Dave Airlie
5c8f8cae3e radv: add IA_MULTI_VGT_PARAM support for GFX9.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:55 +10:00
Dave Airlie
67655cb24f radv: add rb+ support for GFX9
This adds some rb+ support, as on GFX9 we have to disable
it as per radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:45 +10:00
Dave Airlie
c2fbeb7ca0 radv: add GFX9 cache flushing support.
GFX9 needs to write event EOP to a fence buffer, so allocate some
space for this and just write an ever increasing number to it.
This isn't exactly what radeonsi does, but it seems to work.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:40 +10:00
Dave Airlie
b11c4a5546 radv: add texture descriptor/fmask/cmask support for GFX9
This adds gfx9 support for the texture descriptor along
with the fmask/cmask allocation routines.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:37 +10:00
Dave Airlie
87b3799493 radv: add GFX9 to initialisation cmd buffer.
This just adds support for initialising some GFX9 registers,
and handles the different init for the VGT reuse reg.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:35 +10:00
Dave Airlie
98f27b9cce radv: don't setup raster_config on gfx9.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:32 +10:00
Dave Airlie
77b8aa4d95 radv: add gfx9 cp dma support.
This adds support to the CP dma code for GFX9, ported from
radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:29 +10:00
Dave Airlie
41eba750ba radv: add gfx9 depth/stencil surface support.
This is ported from radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:27 +10:00
Dave Airlie
ac3e18916f radv: add GFX9 support for color surfaces.
This is ported from radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:24 +10:00
Dave Airlie
0063da8393 radv: add some misc gfx9 pieces.
This just adds the strings and includes the gfx9 register defs
in some files that we need them in.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:21 +10:00
Dave Airlie
a83f28d536 radv: set offchip hs param like radeonsi.
radeonsi never uses 512 here anymore.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:18 +10:00
Dave Airlie
04924c09be radv: fix typo in comment.
2017-06-06 08:59:30 +10:00
Dave Airlie
114d29e7fe radv: add a comment from radeonsi before cp dma function.
This is just copied over.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 08:44:01 +10:00
Dave Airlie
da3330662f radv: remove doubled up prototype.
Must have snuck in during a rebase.
2017-06-06 08:27:35 +10:00
Dave Airlie
d1a4d229ec radv: split metadata struct into legacy/gfx9 parts.
This is just ported from radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 08:22:45 +10:00
Dave Airlie
d987f90354 radv: refactor some texture descriptor state.
This just splits out some non-gfx9 bits in advance to avoid
regressions.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 08:22:42 +10:00
Dave Airlie
a5d181f60b radv: refactor color surface init before gfx9.
This just moves the code around in preparation for gfx9 support.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 08:22:38 +10:00
Dave Airlie
d3ab239099 radv: refactor depth/stencil state setup
In advance of GFX9 to reduce chances for regression, refactor
this code out so adding the GFX9 changes will be more obvious.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 08:22:33 +10:00
Dave Airlie
b50ab49723 radv: use radv_foreach_stage in a couple of places.
This just collapses a few per-stage things into a loop,
shouldn't affect anything.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 08:20:22 +10:00
Emil Velikov
065dea70e5 radeon: remove out of date LLVM_REVISION.txt
The file was introduced to track which LLVM revision was required, yet
that has quickly gone out of shape.

It has seen no updates since 2013.

Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Aaron Watry <awatry@gmail.com>
2017-06-05 17:12:36 -05:00
Juan A. Suarez Romero
d1df9a595e docs: update calendar, add news item and link release notes for 17.1.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-06-05 21:22:15 +00:00
Juan A. Suarez Romero
3255b9d348 docs: add sha256 checksums for 17.1.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 4908b1e909)
2017-06-05 21:22:15 +00:00
Juan A. Suarez Romero
373c309c24 docs: add release notes for 17.1.2
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 97f6404e50)
2017-06-05 21:22:15 +00:00
Brian Paul
af4017665b gallium/u_threaded: fixes for MSVC
Replace some static assertions with runtime assertions.  The static
asserts don't work/fail on MSVC, despite the offsets being multiples
of 16 (checked with softpipe).

Use correct parameter types for a few gallium context functions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-05 15:06:15 -06:00
Dave Airlie
d8212f847a r600: refactor out some compressed resource state code.
This just takes this out to a separate function as it will
get more complex with images.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2017-06-06 06:09:44 +10:00
Dave Airlie
7a26a0bf09 r600: document some of the missing shader constants.
These are used for fragment shader thread calculations.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2017-06-06 06:09:41 +10:00
Dave Airlie
95c1e57a18 r600: add register info for atomic counters.
The atomic counters on evergreen are implemented via append/consume
UAV counters. This just adds the register info for them. The EOS
packets are used to get the atomic totals extracted post shader
execution for storing into a buffer.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2017-06-06 06:09:37 +10:00
Dave Airlie
a6b71f7588 r600: add missing RAT registers and operations.
This just documents in the headers the RAT operation list,
and the RAT encoding for exports.

The immediate registers are used to point to buffers for the
RAT return values (_RTN instructions).

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2017-06-06 06:09:10 +10:00
Dave Airlie
e119c445da r600/sb: fix typo in field definitions
Pointed out by glennk.
2017-06-06 05:46:14 +10:00
Marek Olšák
4b1e6ed49a tgsi/scan: fix scanning fragment shaders with PrimID and Position/Face
Not relevant to radeonsi, because Position/Face are system values
with radeonsi, while this codepath is for drivers where Position and
Face are ordinary inputs.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-06-05 18:29:42 +02:00
Jason Ekstrand
708664159e i965: Finalize miptrees before prepare_texture
In order to do resolves for texture views with different formats, we
need intel_texture_object::_Format to be valid.  Calling
intel_finalize_mipmap_tree can safely be done multiple times in a row
and should be a fairly cheap operation.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-05 09:26:22 -07:00
Marek Olšák
9275b2233f gallium/u_threaded: remove 16 bytes from tc_batch
All other sentinels occupy what is otherwise unused space.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-05 18:25:57 +02:00
Marek Olšák
3b1ce49bc1 gallium/u_threaded: align batches and call slots to 16 bytes
not sure if this helps

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-05 18:25:57 +02:00
Marek Olšák
2ec50f98a9 st/mesa: don't load cached TGSI shaders on demand
This fixes a performance issue with the shader cache that delayed Gallium
shader create calls until draw calls.

I'd like this in stable, but it's not a showstopper.

Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-05 18:25:57 +02:00
Chih-Wei Huang
bb0452442a Android: use bionic pthread_barrier_* if possible
The pthread_barrier_* functions were introduced to bionic
in Nougat.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-05 14:06:35 +01:00
Dave Airlie
06f4251925 r600: fix incorrect and missing bit field in register headers.
The compression field was incorrect, and we were missing the
depth before shader field.
2017-06-05 13:19:18 +10:00
Nicolai Hähnle
df30123794 radv: use ac_compute_surface
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-05 10:44:30 +10:00
Dave Airlie
607e61c40e radv: prepare fmask surface creation
The old code copied over all the surface info from the image
surface, we only want some bits of it, and to modify the flags.

This prevents a regression in dEQP-VK.api.copy_and_blit.resolve_image.*
and others in the subsequent switch to ac_compute_surface.

v2:
- also disable opt4Space in radv_amdgpu_surface, so that we can
  apply this patch separately *before* switching to ac_compute_surface
  and hopefully avoid intermittent regressions (Nicolai)

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-05 10:44:24 +10:00
Nicolai Hähnle
8354f287db radv: use amdgpu_addr_create
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-05 10:44:22 +10:00
Nicolai Hähnle
40e94847a5 radv: stop using radv_amdgpu_winsys::family
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-05 10:44:18 +10:00
Nicolai Hähnle
bd4493b169 radv: use ac_gpu_info
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-05 10:44:15 +10:00
Nicolai Hähnle
eeb075d662 radv: remove radeon_info::name
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-05 10:44:13 +10:00
Nicolai Hähnle
dfc06d2fac radv: use ac_surface data structures
This is mostly mechanical changes of renaming types and introducing
"legacy" everywhere.

It doesn't use the ac_surface computation functions yet.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-05 10:44:09 +10:00
Nicolai Hähnle
543de22f4b radv: rename radeon_surf::bo_{size,alignment} to surf_{size,alignment}
To match radeonsi / ac_surface.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-05 10:44:05 +10:00
Nicolai Hähnle
8417c21d0a radv: remove unused RADEON_SURF_HAS_SBUFFER_MIPTREE
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-05 10:44:02 +10:00
Nicolai Hähnle
e156eaedb4 radv: remove radeon_surf_level::nblk_z
We're not using thick tiling modes, so we can just derive the value
ourselves.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-05 10:43:59 +10:00
Nicolai Hähnle
34b7fb47b6 radv: remove radeon_surf_level::dcc_enabled
Like radeonsi; replace with radeon_surf::num_dcc_levels.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-05 10:43:56 +10:00
Nicolai Hähnle
59f72e158a radv: remove radeon_surf_level::pitch_bytes
Like radeonsi. This saves memory, and the information can easily be
recomputed on the fly where necessary.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-05 10:43:53 +10:00
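The "recomputed on the fly" remark in the commit above can be sketched as follows. This is only an illustration under assumptions: the struct and function names are hypothetical stand-ins, not the actual radv definitions, and it assumes the pitch in bytes is the level's width in blocks multiplied by the bytes per block element.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_MAX_LEVELS 16

/* Hypothetical, trimmed-down stand-ins for the per-level and
 * per-surface structs; the real driver structs carry many more fields. */
struct example_surf_level {
    uint32_t nblk_x;   /* width of this mip level, in blocks */
};

struct example_surf {
    uint32_t bpe;      /* bytes per block element */
    struct example_surf_level level[EXAMPLE_MAX_LEVELS];
};

/* Derive the pitch in bytes instead of storing it per level. */
static uint32_t example_level_pitch_bytes(const struct example_surf *surf,
                                          unsigned level)
{
    assert(level < EXAMPLE_MAX_LEVELS);
    return surf->level[level].nblk_x * surf->bpe;
}
```

Storing only nblk_x and bpe keeps the per-level struct smaller, at the cost of one multiply wherever the pitch is needed.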
Nicolai Hähnle
a12d288bff radv: add surface helper variable in radv_GetImageSubresourceLayout
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-05 10:43:50 +10:00
Nicolai Hähnle
388d36dfd1 radv: fewer than 8 RBs are possible
This fixes the subsequent assertion on Bonaire.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-05 10:43:47 +10:00
Nicolai Hähnle
e07d5c7296 ac/surface/gfx6: explicitly support S8 surfaces
This is needed by radv for dEQP-VK.renderpass.simple.stencil

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-05 10:43:29 +10:00
Dave Airlie
72f0830ecd ac/nir: set workgroup size attribute to correct value.
This ports: 55445ff189 from radeonsi

    radeonsi: tell LLVM not to remove s_barrier instructions

    LLVM 5.0 removes s_barrier instructions if the max-work-group-size
    attribute is not set. What a surprise.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-05 01:37:44 +01:00
Dave Airlie
68c812f699 ac: add new helper function to add a integer target dependent function attr.
This is needed to add the max workgroup size attribute.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-05 01:37:29 +01:00
Dave Airlie
4ba2e6cbfa radv: add external memory support.
This adds support for exporting 2D images, to an
opaque fd.

This implements the:
VK_KHX_external_memory_capabilities
VK_KHX_external_memory
VK_KHX_external_memory_fd

extensions.

These are used by SteamVR. We should work with anv
to decide if we should ship these under an env
var or something.

v2 (Bas): - Don't expose the semaphore ext without implementing it.
          - Only export the capabilities ext as instance ext.
          - Implement radv_GetPhysicalDeviceExternalBufferPropertiesKHX.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
2017-06-05 02:26:43 +02:00
Bas Nieuwenhuizen
d515b420dd radv: Add VkPhysicalDeviceIDProperties support.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-05 02:26:43 +02:00
Bas Nieuwenhuizen
d513473cc1 radv: Add support for external queue family.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-05 02:26:43 +02:00
Dave Airlie
a935cd926b radv/formats: reverse how the image format properties KHR2 is handled
This just aligns with how anv does it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-05 01:03:30 +01:00
Bas Nieuwenhuizen
4415a46be2 radv: Dirty all descriptors sets when changing the pipeline.
Sets could have been ignored during previous descriptor set flush
due to the shader not using them and therefore no SGPR being assigned.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: ae61ddabe8 "radv: move userdata sgpr ownership to compiler side."
2017-06-03 22:24:37 +02:00
Bas Nieuwenhuizen
5fb8bb3065 radv: Set both compute and graphics SGPRS on descriptor set flush.
We clear the descriptors_dirty array afterwards, so the SGPRs for
the other pipeline don't get updated on the flush for that other
draw/dispatch, so we have to make sure we do it immediately.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: ae61ddabe8 "radv: move userdata sgpr ownership to compiler side."
2017-06-03 22:24:37 +02:00
Chris Wilson
8d07cb125c i965: Order write of query availablity with earlier writes
Currently we signal the availability of the query result using an
unordered pipe-control write. As it is unordered, it may be executed
before the write of the query result itself - and so an observer may
read the query result too early. Fix this by requesting that the write
of the availability flag is ordered after earlier pipe control writes.

Testcase: piglit/arb_query_buffer_object-qbo/*async*
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-06-03 13:38:45 +01:00
Lyude
98fc0243ef nvc0: Add support for ARB_post_depth_coverage
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-06-02 23:19:42 -04:00
Lyude
4dafc4c99a st/mesa: Add support for ARB_post_depth_coverage
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-06-02 23:19:39 -04:00
Lyude
467af445a3 gallium: Add a cap to check if the driver supports ARB_post_depth_coverage
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-06-02 23:19:22 -04:00
Lyude
af788a82d5 gallium: Add TGSI shader token for ARB_post_depth_coverage
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-06-02 23:19:22 -04:00
Lyude
245912b684 nvc0: disable BGRA8 images on Fermi
BGRA8 image stores on Fermi don't work, which results in breaking
PBO downloads, such that they always return 0x0. Discovered this
through a glamor bug, and confirmed it does indeed break a good number
of piglit tests such as spec/arb_pixel_buffer_object/pbo-read-argb8888

Fixes: 8e7893eb53 ("nvc0: add support for BGRA8 images")
Signed-off-by: Lyude <lyude@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-06-02 23:10:36 -04:00
Anuj Phogat
0d576fbfbe i965: Simplify l3 way size computations
By making use of the l3_banks field in the gen_device_info struct:
l3_way_size for gen7+ = 2 * l3_banks.

V2: Keep the get_l3_way_size() function.

Suggested-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-06-02 16:21:56 -07:00
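The relation stated in the commit above (way size for gen7+ is twice the number of L3 banks) might look like the sketch below. The struct and function names are hypothetical, and the unit of the result is whatever the driver defines; this is not the actual i965 code.

```c
#include <assert.h>

/* Hypothetical subset of the device info struct. */
struct example_device_info {
    unsigned l3_banks;   /* number of L3 banks; 0 means "not initialized" */
};

/* Way size derived from the bank count, per the commit message:
 * l3_way_size = 2 * l3_banks. */
static unsigned example_get_l3_way_size(const struct example_device_info *devinfo)
{
    /* A zero bank count would indicate a device table entry that was
     * never filled in, so catch it early. */
    assert(devinfo->l3_banks > 0);
    return 2 * devinfo->l3_banks;
}
```

Keeping the helper function, as the V2 note mentions, leaves a single place to adjust if a later generation needs a different factor.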
Anuj Phogat
eb23be1d97 i965: Add and initialize l3_banks field for gen7+
This new field helps simplify l3 way size computations
in next patch.

V2: Initialize the l3_banks to 0 in macros.

Suggested-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-06-02 16:21:56 -07:00
Chad Versace
e9f5004d5e i965: Replace 0 with ISL_FORMAT_UNSUPPORTED in format table (v2)
When given an *unsupported* mesa_format,
brw_isl_format_for_mesa_format() returned 0, a *valid* isl_format,
ISL_FORMAT_R32G32B32A32_FLOAT.  The problem is that
brw_isl_format_for_mesa_format's inner table used 0 instead of
ISL_FORMAT_UNSUPPORTED to indicate unsupported mesa formats.

Some callers of brw_isl_format_for_mesa_format() were aware of this
weirdness, and worked around it. This patch removes those workarounds.

v2: Ensure that all array elements are initialized to
  ISL_FORMAT_UNSUPPORTED, even when new formats are added to enum
  mesa_format, by using a designated range initializer.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-06-02 12:41:30 -07:00
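The designated range initializer mentioned in v2 of the commit above can be illustrated with a small sketch. All enum names and values here are hypothetical stand-ins; the `[first ... last]` syntax is a GNU C extension supported by GCC and Clang, not standard C.

```c
#include <assert.h>

/* Hypothetical source format enum; EX_MESA_FORMAT_NEW stands for a
 * format added later that nobody remembered to map. */
enum example_mesa_format {
    EX_MESA_FORMAT_NONE,
    EX_MESA_FORMAT_RGBA8,
    EX_MESA_FORMAT_R32F,
    EX_MESA_FORMAT_NEW,
    EX_MESA_FORMAT_COUNT
};

/* Hypothetical target format enum in which 0 is a valid format, so 0
 * cannot double as the "unsupported" marker. */
enum example_isl_format {
    EX_ISL_FORMAT_R32G32B32A32_FLOAT = 0,
    EX_ISL_FORMAT_R8G8B8A8_UNORM = 1,
    EX_ISL_FORMAT_R32_FLOAT = 2,
    EX_ISL_FORMAT_UNSUPPORTED = 0xffff
};

static enum example_isl_format
example_isl_format_for_mesa_format(enum example_mesa_format f)
{
    static const enum example_isl_format table[EX_MESA_FORMAT_COUNT] = {
        /* Default every slot first (GNU range designator)... */
        [0 ... EX_MESA_FORMAT_COUNT - 1] = EX_ISL_FORMAT_UNSUPPORTED,
        /* ...then override the entries that are actually supported. */
        [EX_MESA_FORMAT_RGBA8] = EX_ISL_FORMAT_R8G8B8A8_UNORM,
        [EX_MESA_FORMAT_R32F]  = EX_ISL_FORMAT_R32_FLOAT,
    };

    assert(f < EX_MESA_FORMAT_COUNT);
    return table[f];
}
```

Without the range line, unlisted slots would be zero-initialized and silently map to the valid format with value 0, which is the problem the commit describes.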
Gurchetan Singh
1fec049850 st/dri: Use fence extension in drisw.c
This is desirable for synchronization in virtual machines.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-02 12:33:42 -07:00
Gurchetan Singh
59dc23bba9 st/dri: move fence implemention into separate file
Since the fence implementation is not dri2.c specific, put
it in a separate file. This way SW implementations can use this
extension too.

v2: Don't depend on dri2.c for extensions (Emil)
v3: Make this patch only move extension into a separate file (Chad).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-02 12:33:21 -07:00
Brian Paul
3ba5b8a560 mesa: document range of SampleCoverageValue, MinSampleShadingValue
Trivial.
2017-06-02 08:23:13 -06:00
Brian Paul
c6ba85a8c0 xlib: fix glXGetCurrentDisplay() failure
glXGetCurrentDisplay() has been broken for years and nobody noticed until
recently.  This change adds a new XMesaGetCurrentDisplay() that the GLX
emulation API can call, just as we did for glXGetCurrentContext().

Tested by hacking glxgears to call glXGetCurrentDisplay() before and
after glXMakeCurrent() to verify the return value is NULL beforehand and
the same as the opened display afterward.

Also tested by Tom Hudson with his test programs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100988
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Tom Hudson <tom.hudson.phd@gmail.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
2017-06-02 08:22:55 -06:00
Dave Airlie
bcae327469 radv: realign cp dma code with radeonsi
This reworks this code to be like radeonsi, which will make it
easier to add GFX9 support to it in the future.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-02 12:49:11 +10:00
Dave Airlie
745aa17093 radv: bump some base addresses to 64-bits.
For GFX9 these will need to be 64-bit, so bump them early,
to avoid it causing any weirdness later.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-02 12:49:01 +10:00
Dave Airlie
ad61eac250 radv: factor out eop event writing code. (v2)
In prep for GFX9 refactor some of the eop event writing code
out.

This changes behaviour, but aligns with what radeonsi does,
it does double emits on CIK/VI, whereas previously it only
did this on CIK.

v2: bump the size checks.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-02 12:48:56 +10:00
Dave Airlie
7205431e73 radv: factor out si_emit_wait_fence code.
This code was in a few places, consolidate into one.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-02 12:48:20 +10:00
Jason Ekstrand
1a22c4c960 intel/blorp: Handle gen6 stencil/HiZ offsets in the back-end
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:34:01 -07:00
Jason Ekstrand
d065a9540c intel/isl: Add a helper for getting the byte/tile offset of a subimage
Frequently, get_image_offset_sa is combined with get_intratile_offset_sa
so it makes sense to have a single helper to do both.  If the caller
doesn't want the intratile offsets, it can simply pass NULL and ISL will
assert that they are 0.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:58 -07:00
Jason Ekstrand
b178762d05 intel/isl: Make get_intratile_offset_el take the element size in bits
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:56 -07:00
Jason Ekstrand
757f7087a5 intel/isl: Add a new layout for HiZ and stencil on Sandy Bridge
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:47 -07:00
Jason Ekstrand
cb8cdab8e8 intel/isl: Generate phys_total_el from isl_calc_phys_extent
The only surface layout for which slice0 makes any sense is GEN4_2D.
Move all of the slice0 stuff into isl_calc_phys_total_extent_el_gen4_2d
and make the others trivially return the total size in surface elements.
As a side-effect, array_pitch_el_rows is now returned from these helpers
as well.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:45 -07:00
Jason Ekstrand
918f41bb29 intel/isl: Don't check array pitch for gen4 3D textures
Array pitch doesn't matter in this layout.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:43 -07:00
Jason Ekstrand
044bfb292f intel/isl: Refactor to use a phys_total_el extent.
We've already implicitly been using a physical total size in surface
elements.  This just centralizes things a bit.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:41 -07:00
Jason Ekstrand
1547d133ac intel/isl: Add an isl_assert_div helper
This is a fairly common operation and it's nice to be able to just call
the one little function.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:39 -07:00
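A helper of the kind the commit above describes, one that divides and asserts the division is exact, could be sketched like this. The name is hypothetical and the body is an assumption about what such a helper does, not a copy of the ISL source.

```c
#include <assert.h>
#include <stdint.h>

/* Divide n by a, asserting that a is non-zero and divides n exactly.
 * Useful where a remainder would indicate a layout bug rather than a
 * value to round. */
static inline uint32_t example_assert_div(uint32_t n, uint32_t a)
{
    assert(a != 0 && n % a == 0);
    return n / a;
}
```

A call such as `example_assert_div(row_pitch_bytes, bytes_per_element)` then replaces a separate assert plus a division at each call site.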
Jason Ekstrand
58051ad220 intel/isl: Refactor isl_calc_array_pitch_el_rows
Over 90% of the function only applies to ISL_DIM_LAYOUT_GEN4_2D anyway
so we can just handle the other two as special cases at the top.  The
two "generic" cases below the switch only apply on gen9 and above and
only to 3D or CCS surfaces.  This implies that they only apply to
surfaces with ISL_DIM_LAYOUT_GEN4_2D.  Making them look generic is a
lie.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:37 -07:00
Jason Ekstrand
fe13c59c1b intel/isl: Move isl_calc_array_pitch_el_rows higher up
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:34 -07:00
Jason Ekstrand
c1a70165be intel/isl: Remove the device parameter from isl_tiling_get_info
We were only using it for validating that we don't use Ys/Yf on gen8 and
earlier.  Removing it from isl_tiling_get_info lets us remove it from a
bunch of other things that had no business needing a hardware
generation.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:31 -07:00
Jason Ekstrand
10903d2289 i965: Rework Sandy Bridge HiZ and stencil layouts
Sandy Bridge does not technically support mipmapped depth/stencil.  In
order to work around this, we allocate what are effectively completely
separate images for each miplevel, ensure that they are page-aligned,
and manually offset to them.  Prior to layered rendering, this was a
simple matter of setting a large enough halign/valign.

With the advent of layered rendering, however, things got more
complicated.  Now, things weren't as simple as just handing a surface
off to the hardware.  Any miplevel of a normally mipmapped surface can
be considered as just an array surface given the right qpitch.  However,
the hardware gives us no capability to specify qpitch so this won't
work.  Instead, the chosen solution was to use a new "all slices at each
LOD" layout which laid things out as a mipmap of arrays rather than an
array of mipmaps.  This way you can easily offset to any of the
miplevels and each is a valid array.

Unfortunately, the "all slices at each lod" concept missed one
fundamental thing about SNB HiZ and stencil hardware:  It doesn't just
act as if you're always working with a non-mipmapped surface, it
acts as if you're always working on a non-mipmapped surface of the same
size as LOD0.  In other words, even though it may only write the
upper-left corner of each array slice, the qpitch for the array is for a
surface the size of LOD0 of the depth surface.  This mistake causes us
to under-allocate HiZ and stencil in some cases and also to accidentally
allow different miplevels to overlap.  Sadly, piglit test coverage
didn't quite catch this until I started making changes to the resolve
code that caused additional HiZ resolves in certain tests.

This commit switches Sandy Bridge HiZ and stencil over to a new scheme
that lays out the non-zero miplevels horizontally below LOD0.  This way
they can all have the same qpitch without interfering with each other.
Technically, the miplevels still overlap, but things are spaced out
enough that each page is only in the "written area" of one LOD.

Cc: "17.0 17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-06-01 15:33:26 -07:00
Kenneth Graunke
fe14a9a501 i965: Drop duplicate shadow variable.
We already initialized this at the top of the function.

Trivial.
2017-06-01 14:28:12 -07:00
Jose Fonseca
ce5e83b8a0 automake: Link all libGL.so variants with -Bsymbolic.
We were linking src/glx with -Bsymbolic, but not the classic/gallium X11
libGL.so.

But it's always a good idea to build all libGL.so and all DRI drivers
with -Bsymbolic, otherwise they might resolve symbols from the 3rd party
application executable or shared libraries, which is _never_ what we
want.

In particular, this can happen when intercepting OpenGL calls with
apitrace, before
63194b2573

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-01 21:24:38 +01:00
Chad Versace
9d996e94fb i965/dri: Fix bad GL error in intel_create_winsys_renderbuffer()
This function never occurs in the callchain of a GL function. It occurs
only in the callchain of eglCreate*Surface and the analogous paths for
GLX.  Therefore, even if a thread does have a bound GL context,
emitting a GL error here is wrong. A misplaced GL error, when no GL
call is made, can confuse clients.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-06-01 12:41:32 -07:00
Chad Versace
a23cabd8ca i965: Cleanup in intel_create_winsys_renderbuffer()
Combine variable declarations and assignments.
Trivial cleanup.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-06-01 12:41:30 -07:00
Chad Versace
6551655ffd i965: Remove bad assert on isl_format
translate_tex_format() asserted that isl_format != 0. But 0 is a valid
format, ISL_FORMAT_R32G32B32A32_FLOAT.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-01 12:41:26 -07:00
Chad Versace
de69002faa i965: Fix return type of translate_tex_format()
It returns an isl_format, not GLuint BRW_FORMAT.  I updated every
translate_tex_format() found by git-grep.

No change in behavior.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-01 12:41:24 -07:00
Chad Versace
77e3c836f8 i965: Fix return type of brw_isl_format_for_mesa_format() [v2]
It returns an isl_format, not uint32_t BRW_FORMAT.
I updated every brw_isl_format_for_mesa_format() found by git-grep.

No change in behavior.

v2: Rebased atop Anuj's patch, which has some of the same fixes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
2017-06-01 12:39:35 -07:00
Anuj Phogat
84ede214fc i965: Remove an extra semicolon
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-06-01 12:14:58 -07:00
Anuj Phogat
adb449694a i965: Rename brw_format variable names to isl_format
This patch makes no functional changes. Renaming is just to
make the code more readable.

V2: update the types to "enum isl_format"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-01 12:14:13 -07:00
Chad Versace
7a4964ec5c i965: Reject unsupported formats in glEGLImageTargetTexture2D()
If the EGLImage's format is not a supported texture format according to
brw_surface_formats.c, then refuse to create the miptree. This follows
the precedent in glEGLImageRenderbufferStorage (implemented by
intel_image_target_renderbuffer_storage), which rejects the EGLImage's
format if it is not renderable.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-06-01 12:03:33 -07:00
Kenneth Graunke
fe9699dcb4 genxml: Make 3DSTATE_CONSTANT_BODY on Gen7+ use arrays.
This will let us initialize the constant buffers with loops.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-01 11:49:46 -07:00
Kenneth Graunke
12303bd390 genxml: Fix decoder to print the array element on field members.
Previously we'd print things like:

   0xfffbb568:  0x00010000 : Dword 1
       ReadLength: 0
       ReadLength: 1
   0xfffbb568:  0x00000001 : Dword 1
       ReadLength: 1
       ReadLength: 0

instead of the more obvious:

   0xfffbb568:  0x00010000 : Dword 1
       ReadLength[0]: 0
       ReadLength[1]: 1
   0xfffbb568:  0x00000001 : Dword 1
       ReadLength[2]: 1
       ReadLength[3]: 0

(Yes, the ralloc context here is bogus - the decoder leaks just about
everything.  We need to use proper ralloc contexts someday...)

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-01 11:49:46 -07:00
Kenneth Graunke
73c21e69d0 genxml: Fix decoding of array groups.
If you had a group as the first element of a struct, i.e.

  <struct name="3DSTATE_CONSTANT_BODY" length="10">
    <group count="4" start="0" size="16">
      <field name="ReadLength" start="0" end="15" type="uint"/>
    </group>
    ...
  </struct>

we would get a group_offset of 0, causing create_field() to think the
field wasn't in a group, and fail to offset forward for successive array
elements.  So we'd mark all the array elements as offset 0.

Using ctx->group->elem_size is a better check for "are we in a group?".

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-01 11:49:45 -07:00
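The bug described in the commit above comes from using "offset is non-zero" as the test for "inside a group", which fails for a group that starts at bit 0. A minimal sketch of the corrected idea, with hypothetical struct and function names rather than the decoder's real ones:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of the group currently being parsed.
 * elem_size is 0 when the field is not inside any group. */
struct example_group {
    uint32_t group_offset;  /* first bit of the group; may legitimately be 0 */
    uint32_t elem_size;     /* size of one array element in bits */
};

/* "Are we in a group?" keyed on elem_size, so a group starting at
 * bit 0 is still recognized. */
static int example_in_group(const struct example_group *g)
{
    return g->elem_size != 0;
}

/* Absolute start bit of a field for a given array element. */
static uint32_t example_field_start(const struct example_group *g,
                                    uint32_t elem_index,
                                    uint32_t field_start)
{
    if (!example_in_group(g))
        return field_start;

    assert(field_start < g->elem_size);
    return g->group_offset + elem_index * g->elem_size + field_start;
}
```

For the 3DSTATE_CONSTANT_BODY example in the message (start 0, size 16), element 3 of ReadLength would start at bit 48 instead of every element landing on bit 0.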
Kenneth Graunke
d1b949282f genxml: Fix decoder for groups with multiple fields.
If you have something like:

    <group count="0" start="96" size="32">
      <field name="Entry_0" start="0" end="15" type="GATHER_CONSTANT_ENTRY"/>
      <field name="Entry_1" start="16" end="31" type="GATHER_CONSTANT_ENTRY"/>
    </group>

We would reset ctx->group_count to 0 after processing the first field,
so the second would not have a group count.

This is largely untested, as the only groups with multiple fields are
packets we don't emit in Mesa.  Found by inspection.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-01 11:49:45 -07:00
Kenneth Graunke
df2d55ba57 genxml: Fix parsing of address fields in groups.
For example,

    <group count="4" start="64" size="64">
      <field name="Pointer" start="5" end="63" type="address"/>
    </group>

used to generate:

   const uint64_t v2_address =
      __gen_combine_address(data, &dw[2], values->Pointer, 0);
   ...
   const uint64_t v4_address =
      __gen_combine_address(data, &dw[4], values->Pointer, 0);
   ...

but now generates code with proper subscripts:

   const uint64_t v2_address =
      __gen_combine_address(data, &dw[2], values->Pointer[0], 0);
   ...
   const uint64_t v4_address =
      __gen_combine_address(data, &dw[4], values->Pointer[1], 0);
   ...

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-06-01 11:49:45 -07:00
Eric Engestrom
845d07978f configure.ac: simplify --enable-libunwind=auto check
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-01 16:56:57 +01:00
Nicolas Dechesne
adadadc151 util/rand_xor: add missing include statements
Fixes for:

src/util/rand_xor.c:60:13: error: implicit declaration of function 'open' [-Werror=implicit-function-declaration]
    int fd = open("/dev/urandom", O_RDONLY);
             ^~~~
src/util/rand_xor.c:60:34: error: 'O_RDONLY' undeclared (first use in this function)
    int fd = open("/dev/urandom", O_RDONLY);
                                  ^~~~~~~~

Signed-off-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-06-01 14:26:12 +01:00
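The two compiler errors quoted in the commit above are what one sees when open() and O_RDONLY are used without their header. On POSIX systems both are declared in <fcntl.h>, while read() and close() come from <unistd.h>. The sketch below is a hypothetical seeding helper that shows the includes in context; it is not the actual rand_xor.c code.

```c
#include <assert.h>
#include <fcntl.h>    /* open(), O_RDONLY */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>   /* read(), close(), ssize_t */

/* Fill a two-word seed from /dev/urandom.
 * Returns 1 on success, 0 if the device could not be opened or the
 * read came back short (the caller would then fall back to another
 * seed source). */
static int example_read_urandom_seed(uint64_t seed[2])
{
    assert(seed != NULL);

    int fd = open("/dev/urandom", O_RDONLY);
    if (fd < 0)
        return 0;

    size_t want = 2 * sizeof(uint64_t);
    ssize_t got = read(fd, seed, want);
    close(fd);

    return got == (ssize_t)want;
}
```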
Lucas Stach
cab5996c26 etnaviv: always do cpu_fini in transfer_unmap
The cpu_fini() call pushes the buffer back into the GPU domain, which needs
to be done for all buffers, not just the ones with CPU written content. The
etnaviv kernel driver currently doesn't validate this, but may start to do
so at a later point in time. If there is a temporary resource the fini needs
to happen before the RS uses this one as the source for the upload.

Also remove an invalid comment about flushing CPU caches, cpu_fini takes
care of everything involved in this.

Fixes: c9e8b49b88 ("etnaviv: gallium driver for Vivante GPUs")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-01 15:20:38 +02:00
Emil Velikov
72011f7a7b docs: update calendar, add news item and link release notes for 17.0.7
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-06-01 11:46:39 +01:00
Emil Velikov
0fd1715be1 docs: add sha256 checksums for 17.0.7
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit bdfd5658e7)
2017-06-01 11:42:46 +01:00
Emil Velikov
29c6a1200b docs: add release notes for 17.0.7
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 46cc7a1746)
2017-06-01 11:42:45 +01:00
Samuel Pitoiset
1da51ec0f7 glsl: fix a crash in ir_print_visitor() for bindless samplers/images
Bindless samplers/images are represented with 64-bit unsigned
integers and they can be assigned with explicit constructors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-06-01 11:54:06 +02:00
Samuel Pitoiset
e4e5562d8a glsl: teach opt_array_splitting about bindless images
Memory/format layout qualifiers shouldn't be lost when arrays
of images are split by this pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-01 11:54:06 +02:00
Samuel Pitoiset
678e05cc34 glsl: teach opt_structure_splitting about images in structures
GL_ARB_bindless_texture allows images to be declared inside
structures, but when memory/format qualifiers are used, they
should be propagated when structures are split.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-01 11:54:06 +02:00
Samuel Pitoiset
71efec290c glsl: fix broken indentation in do_structure_splitting()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-01 11:54:06 +02:00
Samuel Pitoiset
ad717102d9 glsl: handle format layout qualifiers for struct with array of images
This handles a situation like:

struct {
   layout (r32f) image2D imgs[6];
} s;

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-01 11:54:06 +02:00
Samuel Pitoiset
d9460ad600 glsl: handle memory qualifiers for struct with array of images
This handles a situation like:

struct {
   image2D imgs[6];
} s;

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-06-01 11:54:06 +02:00
Rhys Kidd
e305400443 nvc0: Clean up unnecessary includes from gallium/auxiliary/vl/
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-01 10:16:14 +02:00
Kenneth Graunke
6d60121fa0 i965: Simplify SO_DECL handling.
We can initialize structs directly, avoid some temporaries, and cut out
about half of the skip component handling.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-01 00:08:29 -07:00
Kenneth Graunke
9a690ada94 i965: Make a local for linked_xfb->Outputs[i], to shorten things.
This seems a bit more readable.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-01 00:08:29 -07:00
Kenneth Graunke
65f5f3c85c i965: Move SOL PSIZ hacks from draw time to link time.
We can just update the gl_transform_feedback_info fields at link time
to make the VUE header fields have the right location and component.
Then we don't need to handle them specially at draw time, which is
expensive.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-06-01 00:08:29 -07:00
Iago Toral Quiroga
3d37cf99c8 mesa/main: replace remaining uses of IROUND() in GetUniform*() by round()
These were correct since they were used only in conversions to signed integers;
however, this makes the implementation a bit more consistent and reduces the
chances of propagating use of these macros to unsigned cases in the future, which
would not be correct.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-06-01 08:44:34 +02:00
Iago Toral Quiroga
1356b42284 mesa/main: conversion from float in GetUniformi64v requires rounding to nearest
As we do for all other cases of float/double conversions to integers.

v2: use round() instead of IROUND() macros (Iago)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-06-01 08:44:34 +02:00
Iago Toral Quiroga
c333082483 mesa/main: Add conversion from double to uint64/int64 in GetUniform*i64v()
v2:
  - need unsigned rounding for double->uint64 conversion (Nicolai)
  - use round() instead of IROUND() macros (Iago)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-06-01 08:44:34 +02:00
Iago Toral Quiroga
cc972c2845 mesa/main: Clamp GetUniformui64v values to be >= 0
Like we do for the 32-bit case.

v2:
  - need unsigned rounding for float->uint64 conversion (Nicolai)
  - use roundf() instead of IROUND() macros (Iago)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-06-01 08:44:34 +02:00
Kenneth Graunke
83e74d7dc1 mesa/main: Clamp GetUniformuiv values to be >= 0
Section 2.2.2 (Data Conversions For State Query Commands) of the
OpenGL 4.5 October 24th 2016 specification says:

"If a command returning unsigned integer data is called, such as
 GetSamplerParameterIuiv, negative values are clamped to zero."

v2: uint to int conversion should clamp to INT_MAX (Nicolai)

v3 (Iago)
  - Add conversions from 64-bit integer paths
  - Rebase on master

v4:
  - need unsigned rounding for float/double->uint conversions (Nicolai)
  - use round{f}() instead of IROUND() macros (Iago)
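
The conversion rules described above can be sketched roughly as follows.
This is only an illustration, not the code from this commit: the helper
names are hypothetical, and the rounding is written without libm so the
snippet stands alone (for the non-negative inputs that reach it, it matches
what roundf() would return).

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: convert a float uniform value for a query that
 * returns unsigned integer data. */
static unsigned
float_to_uint_query(float f)
{
   /* Negative values (and NaN) are clamped to zero. */
   if (!(f > 0.0f))
      return 0;

   /* Saturate values that do not fit in 32 bits (2^32 and above). */
   if (f >= 4294967296.0f)
      return UINT_MAX;

   /* Round to nearest. Adding 0.5 in double precision is exact for any
    * float in this range, and the cast truncates toward zero. */
   double r = (double) f + 0.5;
   return r >= 4294967296.0 ? UINT_MAX : (unsigned) r;
}

/* Hypothetical helper for the v2 note: an unsigned value returned through
 * a signed query clamps to INT_MAX. */
static int
uint_to_int_query(unsigned u)
{
   return u > (unsigned) INT_MAX ? INT_MAX : (int) u;
}
```

With these helpers, -3.7f reads back as 0, 2.5f as 3, and UINT_MAX read
through the signed path as INT_MAX.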

Fixes:
KHR-GL45.gpu_shader_fp64.state_query

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v2)
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-06-01 08:44:34 +02:00
Iago Toral Quiroga
1020448700 mesa/main: fix indentation in _mesa_get_uniform()
v2: also change the style of the large conditional in that function
    to follow the style from most other parts of Mesa (Nicolai)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-06-01 08:44:34 +02:00
Ian Romanick
779b35bbc6 r100: Silence numerous unused this or that warnings
radeon_fbo.c: In function ‘radeon_map_renderbuffer_s8z24’:
radeon_fbo.c:147:50: warning: unused parameter ‘ctx’ [-Wunused-parameter]
 radeon_map_renderbuffer_s8z24(struct gl_context *ctx,
                                                  ^~~
radeon_fbo.c: In function ‘radeon_map_renderbuffer_z16’:
radeon_fbo.c:186:48: warning: unused parameter ‘ctx’ [-Wunused-parameter]
 radeon_map_renderbuffer_z16(struct gl_context *ctx,
                                                ^~~
radeon_fbo.c: In function ‘radeon_unmap_renderbuffer_s8z24’:
radeon_fbo.c:344:52: warning: unused parameter ‘ctx’ [-Wunused-parameter]
 radeon_unmap_renderbuffer_s8z24(struct gl_context *ctx,
                                                    ^~~
radeon_fbo.c: In function ‘radeon_unmap_renderbuffer_z16’:
radeon_fbo.c:377:50: warning: unused parameter ‘ctx’ [-Wunused-parameter]
 radeon_unmap_renderbuffer_z16(struct gl_context *ctx,
                                                  ^~~
radeon_fbo.c: In function ‘radeon_nop_alloc_storage’:
radeon_fbo.c:624:75: warning: unused parameter ‘rb’ [-Wunused-parameter]
 radeon_nop_alloc_storage(struct gl_context * ctx, struct gl_renderbuffer *rb,
                                                                           ^~
radeon_fbo.c:625:12: warning: unused parameter ‘internalFormat’ [-Wunused-parameter]
     GLenum internalFormat, GLuint width, GLuint height)
            ^~~~~~~~~~~~~~
radeon_fbo.c:625:35: warning: unused parameter ‘width’ [-Wunused-parameter]
     GLenum internalFormat, GLuint width, GLuint height)
                                   ^~~~~
radeon_fbo.c:625:49: warning: unused parameter ‘height’ [-Wunused-parameter]
     GLenum internalFormat, GLuint width, GLuint height)
                                                 ^~~~~~
radeon_fbo.c: In function ‘radeon_bind_framebuffer’:
radeon_fbo.c:696:74: warning: unused parameter ‘fbread’ [-Wunused-parameter]
                        struct gl_framebuffer *fb, struct gl_framebuffer *fbread)
                                                                          ^~~~~~
radeon_fbo.c: In function ‘radeon_validate_framebuffer’:
radeon_fbo.c:832:19: warning: unused variable ‘radeon’ [-Wunused-variable]
  radeonContextPtr radeon = RADEON_CONTEXT(ctx);
                   ^~~~~~

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-31 21:14:44 -07:00
Ian Romanick
303b47f253 r100: Use _mesa_get_format_base_format in radeon_update_wrapper
The wrapper is for a renderbuffer around a texture.  Textures can have
formats (e.g., 3) that aren't valid for API generated renderbuffers.
_mesa_base_fbo_format will return 0, but _mesa_get_format_base_format
will return the base format of RGB.

Fixes crashes in piglit tests fbo-alphatest-formats (all subtests
pass) and fbo-colormask-formats (some subtests pass, some fail).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-31 21:14:44 -07:00
Ian Romanick
c24881d39c r100,r200: Don't assume glVisual is non-NULL during context creation
Thanks to EGL_MESA_configless_context, the visual pointer can be NULL.

Fixes a segfault (or assertion failure) in piglit's
egl-configless-context test.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-31 21:14:44 -07:00
Ian Romanick
2dcec62075 r100: Don't assume that the base mipmap of a texture exists
Fixes crashes in piglit's gl-1.2-texture-base-level.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-31 21:14:44 -07:00
Dave Airlie
f42fb0012a r600/eg: add support for tracing IBs after a hang.
This is a poor man's version of the radeonsi ddebug stuff. It should
get hooked into that infrastructure and grow more features, but for
now it just adds an R600_TRACE variable that points to the file you
want the last IB dumped to.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-01 11:20:11 +10:00
Dave Airlie
55d1550d35 glsl/lower_int64: only set progress when something is lowered.
Otherwise we'd get progress continually set if we had non 64-bit
versions of these ops.
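
A minimal sketch of the idea, assuming a hypothetical op representation
(struct op and lower_64bit_ops() are illustrative names, not the GLSL IR
types used by the real pass): progress is reported only when an op was
actually rewritten, so a caller looping until no progress is made can
terminate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an IR operation. */
struct op {
   int bit_size;   /* operand width in bits */
   bool lowered;   /* already rewritten by the pass */
};

/* Returns true only if at least one op was lowered in this run. */
static bool
lower_64bit_ops(struct op *ops, size_t count)
{
   bool progress = false;

   for (size_t i = 0; i < count; i++) {
      /* Non 64-bit ops and already lowered ops are left alone and must
       * not count as progress. */
      if (ops[i].bit_size != 64 || ops[i].lowered)
         continue;

      ops[i].lowered = true;
      progress = true;
   }

   return progress;
}
```

Running the sketch twice over the same ops reports progress the first time
and none the second; a list with only 32-bit ops never reports progress.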

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-01 08:46:35 +10:00
Bas Nieuwenhuizen
af2844116f radv: Revert HTILE reset word to 0xFFFFFFFF.
0x30f regressed mad max.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Fixes: df91abfe5a "radv: Use correct clear words for HTILE."
2017-05-31 23:55:13 +02:00
Rob Herring
e8f82bfd52 Android: major/minor/makedev live in <sys/sysmacros.h>
sysmacros.h was getting implicitly included in types.h until recently in
AOSP master. Define MAJOR_IN_SYSMACROS to explicitly include sysmacros.h.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-05-31 16:35:25 -05:00
Chad Versace
22d6b08d2d egl/android: Drop unused 'format' param in get_back_bo()
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-31 10:45:57 -07:00
Chad Versace
0bcdcebc85 egl/android: Align channel masks in HAL_PIXEL_FORMAT table
Improves readability. No change in behavior.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-31 10:45:57 -07:00
Eric Engestrom
11da77e546 egl/drm: remove temporary fd variable
In all codepaths, this var ends up assigned to the struct, except one:
a cleanup codepath, where the `close()` was removed, leading to fd leaks.
Remove the temp fd and assign to the struct field directly instead.

CovID: 1213930
Fixes: 7ec07beedf ("egl/drm: make use of the
                              dri2_display_destroy() helper")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-31 18:09:27 +01:00
Samuel Pitoiset
c222fa9ada mesa: throw an INVALID_OPERATION error in get_texobj_by_name()
Because get_texobj_by_name() can already throw an INVALID_ENUM
error, it makes more sense to add a check directly there.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-31 12:01:19 +02:00
Samuel Pitoiset
b9c3ce529f mesa: add new 'name' parameter to get_texobj_by_name()
To display better function names when INVALID_OPERATION is
returned. Requested by Timothy.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-31 12:01:08 +02:00
Samuel Pitoiset
30a4e375f5 radeonsi: remove unused si_pm4_state::compute_pkt
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-31 09:20:57 +02:00
Samuel Pitoiset
e4b05a50df radeonsi: remove chip_class define from si_pm4.h
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-31 09:20:55 +02:00
Samuel Pitoiset
d90a6c2f23 radeonsi: merge si_pm4_free_state_simple() into si_pm4_free_state()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-31 09:20:53 +02:00
Samuel Pitoiset
d8debc6aad mesa/util: fix arithmetic use of 'void *' in u_vector_foreach
u_vector_foreach is currently only used by the Intel Vulkan
driver but when this macro is used in mesa core, GCC reports
a compile-time error. Probably because some compiler options
are different.
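
For illustration, the general shape of such a fix looks like the sketch
below. It is not the actual u_vector_foreach macro; element_at() is a
hypothetical helper. Arithmetic on a 'void *' is a GNU extension rather
than ISO C, so stricter compiler options can reject it, while the same
byte offset computed through a 'char *' is portable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: address of element 'index' in a buffer holding
 * elements of 'elem_size' bytes each. */
static void *
element_at(void *data, size_t elem_size, size_t index)
{
   /* 'data + index * elem_size' would be arithmetic on a 'void *'.
    * Casting to 'char *' first makes the byte offset well defined. */
   return (char *) data + index * elem_size;
}
```

For an int array { 10, 20, 30, 40 }, element_at(v, sizeof(int), 2) points
at the 30.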

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-31 09:19:54 +02:00
Timothy Arceri
4e93da30f0 mesa: remove _mesa from static function names
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-31 11:18:17 +10:00
Timothy Arceri
42fea3622f mesa/st: indentation tidy-up
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-31 11:18:17 +10:00
Rob Clark
45e97c994b freedreno/a5xx: drop WFIs in emit_marker5()
The WFIs resulted in always having at least one WFI between draws, which
was slowing down stk by ~5% and xonotic by ~10%.

(also drop bogus assert while we're at it.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-30 20:40:58 -04:00
Rob Clark
76214b9919 freedreno/a5xx: timestamp / time-elapsed queries
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-30 20:40:58 -04:00
Rob Clark
5ed9e8fd5d freedreno/a5xx: rename query result struct
Going to want the same thing for timestamp queries.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-30 20:40:58 -04:00
Rob Clark
8c65f17c3b freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-30 20:40:58 -04:00
Kenneth Graunke
236ffbc442 i965: Delete dead old-school packing structs.
Trivial.
2017-05-30 16:22:33 -07:00
Tim Rowley
c606edb578 swr/rast: code cleanup (no functional change)
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:22:18 -05:00
Tim Rowley
b10c9507ce swr/rast: whitespace changes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:22:12 -05:00
Tim Rowley
ac9d7c3d33 swr/rast: code cleanup (no functional change)
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:22:08 -05:00
Tim Rowley
e9e999ae32 swr/rast: allow early-z if shader uses depth value
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:22:02 -05:00
Tim Rowley
628fefc15c swr/rast: move wireframe/point triangle binning after culling
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:21:57 -05:00
Tim Rowley
3b76dea5d1 swr/rast: remove unused functions
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:21:52 -05:00
Tim Rowley
d91402fefa swr/rast: code cleanup (no functional change)
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:21:47 -05:00
Tim Rowley
7e271a763e swr/rast: move binner utility functions to binner.h
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:21:41 -05:00
Tim Rowley
5ea9a30f50 swr/rast: SIMD16 FE - fix/use SIMD16 calcDeterminantIntVertical()
Stop double pumping the SIMD8 version.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:21:36 -05:00
Tim Rowley
fb9f7bd717 swr/rast: add renderTargetArrayIndex to SWR_PS_CONTEXT
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:21:30 -05:00
Tim Rowley
2438932b7e swr/rast: make simd16 logicops avx512f safe
Express the simd16 logicops in terms of avx512f instructions.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:21:22 -05:00
Tim Rowley
7be26a2d35 swr/rast: SIMD16 FE - add SIMD16 types to jitter
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:21:18 -05:00
Tim Rowley
e3c93d8ddf swr/rast: SIMD16 FE - fix PA_STATE_OP::Reset()
Fixes instanced GS.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:21:12 -05:00
Tim Rowley
fd14c40734 swr/rast: SIMD16 FE - simplify/refactor StreamOut
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:21:07 -05:00
Tim Rowley
a230af8b44 swr/rast: SIMD16 FE - fix conservative rasterization
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:21:02 -05:00
Tim Rowley
f64aea0959 swr/rast: SIMD16 FE - interleaved simdvertex output in GS
Eliminates conversion copies on GS output from simdvertex to simd16vertex.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:20:56 -05:00
Tim Rowley
cbd33e71f7 swr/rast: fix _simd16_movemask_(ps,pd) native AVX512 intrinsics
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:20:51 -05:00
Tim Rowley
9fd68be133 swr/rast: SIMD16 FE - primitive assembly simplification
Reduce/simplify vertex storage usage in PA_STATE_OPT, fix PA
GetNextVSOutput wrap-around behaviour and eliminate unnecessary
SIMDVERTEX copies/storage for tri fan in PA_STATE_OPT

Fixes the OpenGL tri fan test failure under SIMD16 -
triangle-rasterization-overdraw.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:20:44 -05:00
Tim Rowley
4c23523365 swr/rast: silence write of cfg graph
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:20:39 -05:00
Tim Rowley
7e35777624 swr/rast: add CreateDirectoryPath to recursively create directories
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:20:33 -05:00
Tim Rowley
f094d582ec swr/rast: add support for DX1_RGB{_SRGB} formats
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:20:27 -05:00
Tim Rowley
42b4e7cb25 swr/rast: clean up whitespace
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:20:21 -05:00
Tim Rowley
5d542b3204 swr/rast: adjust BinPostSetupPoints* function signature
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:20:15 -05:00
Tim Rowley
b714208415 swr/rast: remove extra pixel center adjustment in BinPostSetupPoints
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-30 17:19:51 -05:00
Kenneth Graunke
56535959fd anv: Port over CACHE_MODE_1 optimization fix enables from brw.
Ben and I haven't observed these to help anything, but they enable
hardware optimizations for particular cases.  It's probably best to
enable them ahead of time, before we run into such a case.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-30 14:59:31 -07:00
Kenneth Graunke
53368b008e genxml: Add Gen9 CACHE_MODE_1 definitions.
These were already in gen8.xml but not gen9.xml.  There are a few new
fields and a couple that have changed.  These are all documented in the
Skylake PRM, Volume 2c Command Reference: Registers, Part 1.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-30 14:59:31 -07:00
Kenneth Graunke
a8fde221a8 i965: Set the "Float Blend Optimization Enable" bit on Gen9+.
This is woefully undocumented.  It's some kind of optimization that
avoids unnecessary render target reads when blending with a floating
point render target, using independent alpha blending modes.

The internal documentation indicates that this bit exists on Cherryview
as well, but the other driver doesn't appear to set it on that platform.
There's also some confusing wording that indicates that it may exist on
Broadwell, but the documentation says it's reserved, so who knows.

I was not able to find any workload that benefited from setting this
bit, but it seems like a good idea to set it nonetheless.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-30 14:59:31 -07:00
Chad Versace
9601b41a33 i965: Fix type of brw_context::render_target_format[]
It's an array of isl_format, not uint32_t. This patch updates every
reference to render_target_format[] found by git-grep.

Trivial cleanup. No change in behavior.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-05-30 12:01:38 -07:00
Chad Versace
6e325f1203 i965: Move func to right comment block in brw_context.h
brw_init_surface_formats() is defined in brw_surface_formats.c, not
brw_wm_surface_state.c.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-05-30 12:01:38 -07:00
Chad Versace
f5702230e0 i965: Document type of GLuint __DRIimage::format
It's either a mesa_format or mesa_array_format.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-05-30 12:01:37 -07:00
Chad Versace
da042d951c i965: Add whitespace in intel_update_image_buffers()
Improve readability.  Add an empty line between two large 'if' blocks.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-05-30 12:01:37 -07:00
Chad Versace
b86e079ab7 i965: Move an 'i' declaration into its 'for' loop
In intel_update_dri2_buffers().
Trivial cleanup.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-05-30 12:01:37 -07:00
Chad Versace
a90a15d638 i965: Fix type of intel_update_image_buffers::format
It's a mesa_format, not an unsigned int.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-05-30 12:01:37 -07:00
Chad Versace
77a1eefa3c i965: Rename intel_create_renderbuffer
The name is misleading because the function is unrelated to GL
renderbuffers. Rename it to intel_create_winsys_renderbuffer.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-05-30 12:01:37 -07:00
Chad Versace
e8a0a5d7f9 i965/dri: Combine declaration and assignment in intelCreateBuffer
Trivial cleanup.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-05-30 12:01:37 -07:00
Chad Versace
85dd3e4de1 i965/dri: Rewrite comment for intelCreateBuffer
The old comment pinned this function to X11 windows. In reality, this
function serves more than X11 and more than just windows.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-05-30 12:01:37 -07:00
Bartosz Tomczyk
fd6c2a3f3e mesa: Avoid leaking surface in st_renderbuffer_delete
v2: add comment in code

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100741
Fixes: a5e733c6b5 mesa: drop current draw/read buffer when ctx is released
Reviewed-by: Rob Clark <robdclark@gmail.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-30 14:48:32 +01:00
Varad Gautam
4c412293d0 egl: advertise EGL_EXT_image_dma_buf_import_modifiers
v2: check for DRIimageExtension version 15 (Jason Ekstrand)

Signed-off-by: Varad Gautam <varad.gautam@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-30 13:56:20 +01:00
Varad Gautam
de3c459bbd egl: implement eglQueryDmaBufModifiersEXT
query and return supported dmabuf format modifiers for
EGL_EXT_image_dma_buf_import_modifiers.

v2: move format check to the driver instead of making format queries
   here and then checking.
v3: Check DRIimageExtension version before query (Daniel Stone)
v4:
- move to DRIimageExtension version 15, check queryDmaBufModifiers before
  calling (Jason Ekstrand)
- pass external_only to the driver instead of setting as EGL_TRUE here
  (Emil Velikov, Daniel Stone)

Signed-off-by: Varad Gautam <varad.gautam@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-30 13:56:20 +01:00
Varad Gautam
6719e058d6 egl: implement eglQueryDmaBufFormatsEXT
allow egl clients to query the dmabuf formats supported on this platform.

v2: return EGLBoolean.
v3: Check DRIimageExtension version before querying (Daniel Stone).
v4: move to DRIimageExtension version 15, error checking (Jason Ekstrand).

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Signed-off-by: Varad Gautam <varad.gautam@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-30 13:56:20 +01:00
Varad Gautam
6f10e7c37a egl/dri2: Create EGLImages with dmabuf modifiers
Allow creating EGLImages with dmabuf format modifiers when target is
EGL_LINUX_DMA_BUF_EXT for EGL_EXT_image_dma_buf_import_modifiers.

v2:
- clear modifier assembling and error label name (Eric Engestrom)
v3:
- remove goto jumps within switch-case (Emil Velikov)
- treat zero as valid modifier (Daniel Stone)
- ensure same modifier across all dmabuf planes (Emil Velikov)
v4:
- allow modifiers to add extra planes (Louis-Francis Ratté-Boulianne)
v5:
- fix error checking, some cleanups (Jason Ekstrand)
- pass single copy of the modifier to createImageFromDmaBufs2

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Signed-off-by: Varad Gautam <varad.gautam@collabora.com>
Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-30 13:56:20 +01:00
Varad Gautam
c5929634a0 dri: introduce dmabuf format modifier related handles
these allow dmabuf import with modifiers, and supported format and
modifier queries, which are used to implement
EGL_EXT_image_dma_buf_import_modifiers.

v2:
- squash dmabuf queries into DRIimage version 15 (Jason Ekstrand).
- add external_only param to queryDmaBufModifiers (Emil, Daniel Stone)
- pass a single modifier to createImageFromDmaBufs2 since all planes have
the same modifier (Jason Ekstrand)

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Signed-off-by: Varad Gautam <varad.gautam@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-30 13:56:20 +01:00
Pekka Paalanen
fb2a1c2327 egl/main: add support for fourth plane tokens
The EGL_EXT_image_dma_buf_import_modifiers extension adds support for a
fourth plane, just like DRM KMS API does.

Bump maximum dma_buf plane count to four.

v2: prevent attribute tokens from being parsed if
    EXT_image_dma_buf_import_modifiers is not supported. (Emil Velikov)

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Signed-off-by: Varad Gautam <varad.gautam@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-05-30 13:56:20 +01:00
Pekka Paalanen
9434f057c8 egl: introduce DMA_BUF_MAX_PLANES
Rather than hardcoding 3, use a #define. Makes it easier to bump this
later to 4.

Signed-off-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Signed-off-by: Varad Gautam <varad.gautam@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-05-30 13:56:20 +01:00
Alexandre Courbot
76aa1bbb89 nvc0: support for GP10B
GP10B uses the same 3D class as GP100.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-05-30 08:27:00 -04:00
Tomeu Vizoso
106b2786b6 etnaviv: Don't try to use the index buffer if size is zero
If info->index_size is zero, info->index will point to uninitialized
memory.

Fatal signal 11 (SIGSEGV), code 2, fault addr 0xab5d07a3 in tid 20456 (surfaceflinger)

lst: Remove useless indexbuf conditional in the index_size != 0 case.

Fixes: 330d0607ed ("gallium: remove pipe_index_buffer and set_index_buffer")
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-05-30 11:45:10 +02:00
Kenneth Graunke
d529d5ff16 i965: Always scissor on Gen4-5 instead of disabling guardband.
See commit ece0e535a4.  This makes
Gen4-5 follow the behavior we use on Gen6+.  It seems to have
worked out there.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-29 21:46:48 -07:00
Kenneth Graunke
70be2a96a5 i965: Unify Gen4-5 and Gen6 SF_VIEWPORT/CLIP_VIEWPORT code.
This brings the improved guardbanding we implemented on Gen6+
back to the older Gen4-5 code.  It also deletes piles of code.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-29 21:46:46 -07:00
Kenneth Graunke
01cb6cd473 i965: Make a set_scissor_bits helper function.
Gen4-5 include a single SCISSOR_RECT in SF_VIEWPORT.

Making a helper function will allow us to reuse this code for Gen4-5.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-29 21:46:43 -07:00
Kenneth Graunke
55862ed477 i965: Use GENX(packet_length) rather than hardcoded dword counts.
This is clearer and less likely to break in the future.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-29 21:46:42 -07:00
Kenneth Graunke
c6b623f601 i965: Move the scissoring code up near the viewport code.
These are fairly related.  Gen4-5 combine the scissor rectangle and
SF_VIEWPORT.  Co-locating them will allow me to avoid forward
declarations of helper functions in a few patches.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-29 21:46:40 -07:00
Kenneth Graunke
9afe5846d2 genxml: Make a SCISSOR_RECT structure on Gen4-5.
Gen6+ support multiple scissor rectangles, and define a SCISSOR_RECT
structure containing their dimensions.  On Gen4-5, those same fields
exist in SF_VIEWPORT.

This patch extracts the SF_VIEWPORT fields into a SCISSOR_RECT
structure.  Although not a named concept on Gen4-5, it works just
as well, and gives us a consistent SCISSOR_RECT structure across
all generations, making it easier to reuse code.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-29 21:46:37 -07:00
Kenneth Graunke
44309dcea3 i965: Replace brw->gen and devinfo->gen with GEN_GEN.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-29 21:46:36 -07:00
Kenneth Graunke
4ce103e01a i965: Rework Sandybridge 3DSTATE_VIEWPORT_STATE_POINTERS.
On Gen7+ we emit 3DSTATE_VIEWPORT_STATE_POINTERS_{SF_CL,CC} when
emitting a new viewport.

This patch makes us take the same approach on Sandybridge - but because
we have a combined command, we just set the appropriate "change" bits.
This eliminates an atom, some dirty flagging, and some brw->*.vp_offset
writes.  It does mean we'll emit two 3DSTATE_VIEWPORT_STATE_POINTERS
instead of one if both change, but that's probably fine.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-29 21:46:33 -07:00
Kenneth Graunke
7f4645e89c i965: Port CC_VIEWPORT to genxml.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-29 21:46:30 -07:00
Kenneth Graunke
1e3880544e i965: Ignore INTEL_SCALAR_* debug variables on Gen10+.
Scalar mode has been default since Broadwell, and vector mode is getting
increasingly unmaintained.  There are a few things that don't even fully
work in vector mode on Skylake, but we've never cared because nobody
uses it.  There's no point in porting it forward to new platforms.

So, just ignore the debug options to force it on.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-29 21:40:44 -07:00
Timothy Arceri
2c2ea573e5 mesa: add KHR_no_error support for glBindBufferRange()
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-30 08:03:32 +10:00
Timothy Arceri
b8174a837f mesa: create bind_buffer_range() helper
This will help us add KHR_no_error support.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-30 08:03:32 +10:00
Timothy Arceri
3eb6d34dfc mesa: convert mesa_bind_buffer_range_transform_feedback() to a validate function
This allows some tidy up and also makes it so we can add KHR_no_error
support.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-30 08:03:32 +10:00
Timothy Arceri
863b19ae21 mesa: create _mesa_bind_buffer_range_xfb() helper
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-30 08:03:32 +10:00
Timothy Arceri
21d9376e71 mesa: split bind_atomic_buffer() in two
This will help us add KHR_no_error support.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-30 08:03:32 +10:00
Timothy Arceri
135e5659bd mesa: split bind_buffer_range_shader_storage_buffer() in two
This will help us implement KHR_no_error support.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-30 08:03:32 +10:00
Timothy Arceri
cea384fa75 mesa: split bind_buffer_range_uniform_buffer() in two
This will help us implement KHR_no_error support.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-30 08:03:32 +10:00
Timothy Arceri
85e891283c mesa: add KHR_no_error support for glVertexArrayVertexBuffer()
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-30 08:03:32 +10:00
Timothy Arceri
9d331739ae mesa: add KHR_no_error support for glBindVertexBuffer()
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-30 08:03:32 +10:00
Timothy Arceri
9db595e0de mesa: split vertex_array_vertex_buffer() in two
This will allow us to skip the error checks when adding
KHR_no_error support.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-30 08:03:32 +10:00
Bas Nieuwenhuizen
18efb404cf radv: Reserve space for descriptor and push constant user SGPR setting.
flush_compute_state doesn't reserve a large chunk, so these need their own reservation.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
2017-05-29 22:30:39 +02:00
Leo Liu
ea79c0440c amd/common: set vcn dec as hw decode as well
Recommit after issue resolved by the previous patch.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-29 14:32:29 -04:00
Leo Liu
0abc24723c amd/common: add vcn dec ip info query for amdgpu version 3.17
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-29 14:32:29 -04:00
Gregory Hainaut
79f0fe655d glthread/gallium: require safe_glthread to start glthread
Print an error message for the user if the requirement isn't met, or
we're not thread safe.

v2: based on Nicolai's feedback
Check the DRI extension version

v3: based on Emil's feedback
improve commit and error messages.
use backgroundCallable variable to improve readability

v5: based on Emil's feedback
Properly check the function pointer
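
A minimal sketch of the gating described above, assuming a simplified
stand-in for the background-callable extension. The struct layout and the
names bg_callable_ext, is_thread_safe and can_start_glthread are
hypothetical and only illustrate the version and function pointer checks;
they are not the actual Mesa definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the loader-provided extension; the field
 * names are illustrative only. */
struct bg_callable_ext {
   int version;
   void (*set_background_context)(void *loader_private);
   bool (*is_thread_safe)(void *loader_private); /* assumed present from v2 on */
};

/* Only start the extra thread when the loader can tell us it is safe:
 * the extension must exist, be recent enough, and provide the callback. */
static bool
can_start_glthread(const struct bg_callable_ext *ext, void *loader_private)
{
   if (ext == NULL)
      return false;
   if (ext->version < 2 || ext->is_thread_safe == NULL)
      return false;
   return ext->is_thread_safe(loader_private);
}

/* Example callbacks: one for a loader that never touches Xlib from the
 * driver, one for an application that did not initialize Xlib threading. */
static bool always_safe(void *loader_private) { (void)loader_private; return true; }
static bool never_safe(void *loader_private) { (void)loader_private; return false; }
```

Checking the version before the pointer matters because a v1 loader may not
have allocated the newer field at all.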

Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-29 17:07:04 +01:00
Gregory Hainaut
3fde8db53a egl: implement __DRIbackgroundCallableExtension.isThreadSafe
v2:
bump version

v3:
Add code comment
s/IsGlThread/IsThread/ (and variation)
Include X11/Xlibint.h protected by ifdef

v5: based on Daniel's feedback
Move non X11 code outside of X11 define
Always return true for Wayland

Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-29 17:06:57 +01:00
Gregory Hainaut
63b78c939b glx: implement __DRIbackgroundCallableExtension.isThreadSafe
v2:
bump version

v3:
Add code comment
s/IsGlThread/IsThread/ (and variation)

v4:
DRI3 doesn't hit X through GL call so it is always safe

Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-29 17:06:49 +01:00
Gregory Hainaut
fa84f6225b dri: Extend __DRIbackgroundCallableExtensionRec to include a callback that checks for thread safety
DRI-drivers could call Xlib functions, for example to allocate a new back
buffer.

When glthread is enabled, the driver runs mostly on a separate thread.
Therefore we need to guarantee the thread safety between libX11 calls
from the applications (not aware of the extra thread) and the ones from
the driver.

See discussion thread:
   https://lists.freedesktop.org/archives/mesa-dev/2017-April/152547.html

Fortunately, Xlib can lock the display to ensure thread safety, but the
application must call XInitThreads first to initialize the lock function
pointer. This patch makes it possible to check that XInitThreads was called
before allowing glthread on the GLX or EGL platforms.

Note: an attempt was made to port the libX11 code to XCB, but it did not
fully solve thread safety.
See discussion thread:
   https://lists.freedesktop.org/archives/mesa-dev/2017-April/153137.html

Note: Nvidia forces the driver to call XInitThreads. Quoting their manpage:
"The NVIDIA OpenGL driver will automatically attempt to enable Xlib
thread-safe mode if needed. However, it might not be possible in some
situations, such as when the NVIDIA OpenGL driver library is dynamically
loaded after Xlib has been loaded and initialized. If that is the case,
threaded optimizations will stay disabled unless the application is
modified to call XInitThreads() before initializing Xlib or to link
directly against the NVIDIA OpenGL driver library. Alternatively, using
the LD_PRELOAD environment variable to include the NVIDIA OpenGL driver
library should also achieve the desired result."

v2: based on Nicolai's and Matt's feedback
Use C style comment

v3: based on Emil's feedback
split the patch in 3
s/isGlThreadSafe/isThreadSafe/

v5: based on Marek's comment
Add a comment that isThreadSafe is supported by extension v2

Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-29 17:06:37 +01:00
Emil Velikov
5cb16e07ab egl/wayland: use the image_driver alongside the image_loader
Analogous to earlier commits - image_driver and image_loader are meant
to be used hand in hand.

v2: Rebase

Cc: Derek Foreman <derekf@osg.samsung.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-29 16:59:47 +01:00
Emil Velikov
429d56693d egl/wayland: set the resize_callback if the flush extension is available
Strictly speaking __DRI_DRI2 implies __DRI2_FLUSH. Although since we're
using the latter in the callback, we want to use the correct guard.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-29 16:59:46 +01:00
Emil Velikov
6ef0fc400c egl/wayland: select the format based on the interface used
Rather than misleadingly depending on DRI2 for the WL_DRM vs WL_SHM
formats, use the wl_drm and wl_shm interface respectively.

Fixes: a1727aa75e ("egl/wayland: Don't use DRM format codes for SHM")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-29 16:59:45 +01:00
Emil Velikov
d6ecd1647f egl/surfaceless: use the image_driver for image_loader
Analogous to previous commit.

Cc: Chad Versace <chadversary@chromium.org>
Cc: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-29 16:59:41 +01:00
Emil Velikov
acf125ed3a egl/android: use the image_driver alongside the image_loader
They are meant to be used together. Otherwise we'll need workarounds
like egl/wayland. Namely register an image_loader_extension even though
we should be using only DRI2.

v2: Add the missing bracket to fix the build (Tapani).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-05-29 16:59:39 +01:00
Emil Velikov
6b46854269 egl/x11: flatten codeflow
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-29 16:59:38 +01:00
Emil Velikov
14e51d526f egl/x11: check for dri2_dpy->flush before using the flush extension
Analogous to earlier commit.

Note that the dri2_x11_post_sub_buffer and dri2_x11_swap_buffers_region
paths already implicitly require __DRI2_FLUSH. The corresponding
extensions (NV_post_sub_buffer and NOK_swap_region) are enabled only
with DRI2.

v2: Split cosmetic changes into separate patch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-29 16:59:20 +01:00
Emil Velikov
1398ece02c egl/drm: flatten codeflow
Rework the code to return early and drop an indentation level.
It should be easier to read.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-29 16:59:19 +01:00
Emil Velikov
4db5e83227 egl/drm: check for dri2_dpy->flush before using the flush extension
The current __DRI_DRI2 implies __DRI2_FLUSH. At the same time, one can
use __DRI_IMAGE_DRIVER alongside the latter, so the current check is
confusing at best.

Check for what we use.

v2: Split out from whitespace changes

Reviewed-by: Chad Versace <chadversary@chromium.org> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-29 16:59:16 +01:00
Emil Velikov
79d1fb95ee egl: annotate dri2_egl_display_vtbl as const data
With the final place that modifies the vtbl removed as of last commit we
can annotate the symbols accordingly.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-29 16:59:15 +01:00
Emil Velikov
83a792cf25 egl/wayland: don't modify the vtbl if an extension is not available
With previous commit we'll error out should one be using the extension
when it's not available. Thus we no longer need to modify the vtbl.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-29 16:59:14 +01:00
Emil Velikov
701311425e egl: error out on eglCreateWaylandBufferFromImageWL
Currently, if one does the silly thing of probing the entry point without
checking the extension, they will attempt to use the extension even
though it cannot work.
That is due to our use of an assert, which gets removed in release builds.

Simply error out if the extension is not enabled. Thus we can
apply some cleanups with next commits.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-29 16:59:12 +01:00
Emil Velikov
46cc022d5d gbm: manage only the required set of DRI extensions
Currently GBM attempts to know all the extensions that might be required
by EGL/DRM [at some later stage].

That is a bit unclear and we often forget to update GBM as EGL gets
attention.

To avoid that, simply let EGL manage its own required extensions based
on the base primitive (screen) we provide it.

v2: Rework the approach - GBM should not dive into EGL/DRM.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Rob Herring <robh@kernel.org>
2017-05-29 16:50:12 +01:00
Emil Velikov
90d0ad14ca egl/drm: use dri2_setup_extensions() over the extensions provided by GBM
Allows us to keep things in sync easier and lets us simplify the
interface between the two even further.

v2: Don't set GBM's extensions.

Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Rob Herring <robh@kernel.org>
2017-05-29 16:50:09 +01:00
Emil Velikov
2c341f2bda egl: refactor dri2_create_screen() into three separate functions
Split the create_screen into:
 - create screen
 - setup/bind extensions
 - setup screen

This will allow us to reuse the latter two on egl/drm. Said platform
does create its own screen and attempts to reinvent the latter two
functions itself.

Since the GBM ones tend to get out of sync quite often, and there is no
distinct reason why it does so, we'll drop them with later commits.

v2: disp -> dpy for the Android platform.
v3: use correct goto label (Rob)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Rob Herring <robh@kernel.org>
2017-05-29 16:50:06 +01:00
Emil Velikov
ee3b32696f egl/x11: make use of the dri2_display_destroy() helper
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Rob Herring <robh@kernel.org>
2017-05-29 16:50:04 +01:00
Emil Velikov
a0163f9284 egl/wayland: make use of the dri2_display_destroy() helper
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Rob Herring <robh@kernel.org>
2017-05-29 16:50:02 +01:00
Emil Velikov
c8d366bab2 egl/surfaceless: make use of the dri2_display_destroy() helper
Cc: Chad Versace <chadversary@chromium.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Rob Herring <robh@kernel.org>
2017-05-29 16:50:00 +01:00
Emil Velikov
7ec07beedf egl/drm: make use of the dri2_display_destroy() helper
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Rob Herring <robh@kernel.org>
2017-05-29 16:49:58 +01:00
Emil Velikov
898d7858f8 egl/android: make use of dri2_display_destroy() helper
v2: disp -> dpy (Tapani)

Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Rob Herring <robh@kernel.org>
2017-05-29 16:49:55 +01:00
Emil Velikov
3e73c0245b egl: split out a dri2_display_destroy() helper
Within dri2_display_release() we already tear down all the display
specifics. Within the platform specific dri initialize however we badly
and partially duplicate that.

Let's stop that by fleshing out the required functionality into a helper
and using it throughout the codebase.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Rob Herring <robh@kernel.org>
2017-05-29 16:49:52 +01:00
Tapani Pälli
12196d1b76 egl: check for driver_configs in dri2_display_release
With later commits we'll split and reuse the destroy side of the
function for the initialize_foo error path.

In such cases, driver_configs may be NULL leading to a crash.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
[Emil Velikov: reword commit message]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Rob Herring <robh@kernel.org>
2017-05-29 16:49:49 +01:00
Emil Velikov
628af2bc96 gbm: remove unneeded gbm_drm_device abstraction
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Rob Herring <robh@kernel.org>
2017-05-29 16:49:47 +01:00
Emil Velikov
e183c55275 gbm: move gbm_drm_device::driver_name to gbm_dri_device
The former already keeps track of the DRI module opened, based on the
driver_name provided. So let's keep them together.

As a nice bonus, this will allow us to remove gbm_drm_device altogether
with the next patch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Rob Herring <robh@kernel.org>
2017-05-29 16:49:44 +01:00
Emil Velikov
2204ea6464 gbm: remove "struct gbm_drm_bo" abstraction
The struct is a simple wrapper around gbm_bo and brings no actual
benefit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Rob Herring <robh@kernel.org>
2017-05-29 16:49:42 +01:00
Emil Velikov
b5ab59ce37 gbm: remove unused gbm_dri_device::loader
Introduced back in 2012 with fd6acb97fb ("gbm: Create hooks for
dri2_loader_extension in dri backend") and hasn't been used since.

Seemingly a copy/paste thinko from development stage.

Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Rob Herring <robh@kernel.org>
2017-05-29 16:49:39 +01:00
Emil Velikov
2b6ad89d86 radv: automake: list shared libraries after the static ones
Analogous to previous commit - the compiler can discard xcb + wayland
libs, since there is no user (the static libraries) before it on the
command line.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-05-29 16:42:44 +01:00
Emil Velikov
3e8790bff0 anv: automake: list shared libraries after the static ones
The compiler can discard the shared ones from the link chain, since
there is no user (the static libraries) before it on the command line.

Cc: mesa-stable@lists.freedesktop.org
Reported-by: Laurent Carlier <lordheavym@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-05-29 16:42:41 +01:00
Samuel Pitoiset
55083705cf mesa: add KHR_no_error support for glBindImageTextures()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-29 10:11:44 +02:00
Samuel Pitoiset
def908af6c mesa: add KHR_no_error support for glBindImageTexture()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-29 10:11:43 +02:00
Samuel Pitoiset
3ca5da2704 mesa: add bind_image_texture() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-29 10:11:41 +02:00
Samuel Pitoiset
1f75915e1a mesa: add set_image_binding() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-29 10:11:39 +02:00
Samuel Pitoiset
b12dfb1558 mesa: remove unused layered parameter from validate_bind_image_texture()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-29 10:11:36 +02:00
Samuel Pitoiset
5521dc2477 mesa: add KHR_no_error support for glActiveTexture()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-05-29 10:03:11 +02:00
Marek Olšák
48b91103ce radeonsi: use ac_build_buffer_load for shader buffer loads
and document why we can't use SMEM yet.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-29 01:52:16 +02:00
Marek Olšák
e019ea8f4b radeonsi: move building llvm.SI.load.const into ac_build_buffer_load
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-29 01:52:16 +02:00
Marek Olšák
e1942c970f radeonsi: rename readonly_memory -> can_speculate
This is more accurate.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-29 01:52:16 +02:00
Marek Olšák
24306c0b27 radeonsi: fix a crash in si_destroy_context if we fail early
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-29 01:52:16 +02:00
Marek Olšák
c70b0604f0 util: slab_destroy_child should check whether it's been initialized
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-29 01:52:16 +02:00
Bas Nieuwenhuizen
5cd8ab49fd radv: Also signal fence if vkAcquireNextImageKHR returns VK_SUBOPTIMAL_KHR.
It is a successful return.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-05-29 00:09:45 +02:00
Rob Clark
8fc9702a1b freedreno: fix fence creation fail if no rendering
Android tries to create a FENCE_FD fence without any rendering.  And
then falls over when that fails.  So just always create an initial
batch.

Fixes: e4ad8695 ("freedreno: fix crash when flush() but no rendering")
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-28 14:49:27 -04:00
Samuel Pitoiset
ab8fb5a082 radeonsi: drop useless memcmp() check in si_set_blend_color()
cso_set_blend_color() already checks if the old state is different.
Only Nine uses pipe::set_blend_color() directly but I guess it
should use the cache too.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-27 18:00:45 +02:00
Roland Scheidegger
d2724fe5bd llvmpipe: add LP_NEW_GS flag for updating vertex info
The vertex information we compute here is really dependent on the last
stage before FS. It just happened to work most of the time because new
GS tend to come with new VS and/or FS...
(The LP_NEW_GS flag was previously set but never used.)

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-05-27 15:49:21 +02:00
Brian Paul
31ff7bff5a svga: document some incorrect VGPU10 shader translation issues
We have a few mistakes in our shader translation code, but the virtual
GPU is forgiving.

Reviewed-by: Michal Krol <michal@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-05-26 20:05:30 -06:00
Jason Ekstrand
21ddab4a17 i965/copy_image: Use the blitter on gen5
This was just an accidental typo in the refactoring.  The intention was
to try the blitter on gen4-5, not just gen4.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-05-26 14:44:29 -07:00
Alexandre Demers
a958a30827 osmesa: link with libunwind if enabled (v2)
Fixes linking error in libOSMesa when using libunwind.

CXXLD    libOSMesa.la
src/gallium/auxiliary/.libs/libgallium.a(u_debug_stack.o): In function `symbol_name_cached':
./src/gallium/auxiliary/util/u_debug_stack.c:87: undefined reference to `_ULx86_64_get_proc_name'
src/gallium/auxiliary/.libs/libgallium.a(u_debug_stack.o): In function `debug_backtrace_capture':
./src/gallium/auxiliary/util/u_debug_stack.c:114: undefined reference to `_Ux86_64_getcontext'
./src/gallium/auxiliary/util/u_debug_stack.c:115: undefined reference to `_ULx86_64_init_local'
./src/gallium/auxiliary/util/u_debug_stack.c:117: undefined reference to `_ULx86_64_step'
./src/gallium/auxiliary/util/u_debug_stack.c:123: undefined reference to `_ULx86_64_get_reg'
./src/gallium/auxiliary/util/u_debug_stack.c:124: undefined reference to `_ULx86_64_get_proc_info'
./src/gallium/auxiliary/util/u_debug_stack.c:120: undefined reference to `_ULx86_64_step'
collect2: error: ld returned 1 exit status

v2 : Fixes title and adds the original error it is fixing.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-05-26 09:29:44 -06:00
Jason Ekstrand
726b68ad82 i965/blorp: Support copyteximage on gen4-5
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
b06c63c782 i965: Use blorp for CopyImageSubData on gen4-5
We keep the blit path because it's probably faster when it works.
However, now that we can use blorp, we can delete that nasty CPU
fall-back path.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
0901d0bc4c i965: Round copy size to the nearest block in intel_miptree_copy
The width and height of the copy don't have to be aligned to the block
size if they specify the right or bottom edges of the image.  (See also
the comment and asserts right above).  We need to round them up when we
do the division in order to get it 100% right.
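
A minimal sketch of the rounding described above, assuming the extent is
given in texels and the block size is a compression block dimension. The
helper name blocks_for_extent is hypothetical and is not the function used
in intel_miptree_copy.

```c
#include <assert.h>
#include <stdint.h>

/* Number of whole blocks needed to cover `size` texels when each block
 * spans `block` texels. A partial block at the right or bottom edge still
 * counts as one block, so the division has to round up. */
static uint32_t
blocks_for_extent(uint32_t size, uint32_t block)
{
   assert(block > 0);
   /* Written as quotient plus remainder check so that a large `size`
    * cannot overflow the way (size + block - 1) could. */
   return size / block + (size % block != 0);
}
```

For example, a 10-texel-wide copy of a format with 4-texel-wide blocks
covers 3 blocks, where a truncating division would give 2 and drop the
partial block at the edge.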

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "17.0 17.1" <mesa-stable@lists.freedesktop.org>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
79f2a5541f i965: Use BLORP for color clears on gen4-5
We don't support replicated data clears yet.  Those take a bit more work
and enabling replicated data clears in its own commit is probably better
for bisectability anyway.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
6d11362d8b i965: Use blorp for color blits on gen4-5
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
fa13ef285d intel/blorp: Assert that no one tries to blit combined depth stencil
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
752d7af77a i965: Add blorp support for gen4-5
Due to complications with things such as URB setup on gen4-5, it's
easier to keep gen4 support in blorp completely internal to i965.  This
makes things a bit awkward because that means there's a file in i965
that includes blorp_priv.h but it's either that or have a file in blorp
that includes brw_context.h.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
23125b7102 intel/blorp: Set additional brw_wm_prog_key fields on gen4-5
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
79b486f736 i965/gen4: Expose the guts of URB recalculation as a helper
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
0ed6f196fc intel/blorp: Add support for gen4-5 SF programs
As part of enabling support for SF programs, we plumb the SF URB size
through to emit_urb_config.  For now, it's always zero but, on gen4, it
may be something larger.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
8bce7bda45 intel/blorp: Make convert_to_single_slice available outside blorp_blit
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
110061afa2 intel/blorp: Use designated initializers to set up VERTEX_ELEMENTS
We also add a slot variable and use it as an iterator.  This will make
it much easier to conditionally put something between the header and the
vertex position.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
ac79806766 intel/blorp: Rename emit_viewport_state to emit_cc_viewport
The real point of this packet is that it sets up CC_VIEWPORT so that
name is a bit better.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
1f2f90be1f intel/blorp: Make the common genX_blorp_exec code gen4-safe
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
a7f5d6df8a intel/blorp: Re-arrange blorp_genX_exec.h
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
302c0488cf intel/blorp: Don't use ffma directly
It isn't supported prior to gen6 and, on gen6+, NIR will fuse the fmul
and fadd into an ffma automatically for us anyway.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
675ec434f3 intel/blorp: Delete isl_to_gen_ds_surfype
It's no longer used.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
e80f0840bf intel/blorp: Pull the pipeline bits of blorp_exec into a helper
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
3d35e5a51e intel/blorp/blit: Add support for normalized coordinates
Gen5 and earlier can't do non-normalized coordinates so we need to
compensate in the shader.  Fortunately, it's pretty easy to plumb through.
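
A minimal sketch of the compensation, assuming the shader works in texel
units and the sampler expects coordinates in [0, 1]. The names vec2 and
normalize_coords are hypothetical; the real change is done in the blorp
blit shader, not in C on the CPU.

```c
/* Hypothetical two-component coordinate. */
struct vec2 {
   float x;
   float y;
};

/* Convert non-normalized (texel-space) coordinates to normalized ones by
 * dividing by the surface dimensions. */
static struct vec2
normalize_coords(float x, float y, float width, float height)
{
   struct vec2 result = { x / width, y / height };
   return result;
}
```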

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
18e18a1863 i965: Move clip program compilation to the compiler
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
9fb8a8775b i965: Move SF compilation to the compiler
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
c30587643e i965/clip: Make brw_clip_prog_key::interp_mode an array
Having it be a pointer means that we end up caching clip programs based
on a pointer to wm_prog_data rather than the actual interpolation modes.
We've been caching one clip program per FS ever since 91d61fbf7c
where Timothy rewrote brw_setup_vue_interpolation().
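
A minimal sketch of why the pointer hurts caching, assuming program keys
are compared bytewise. The structs key_with_pointer and key_with_array and
the slot count are hypothetical simplifications, not the actual
brw_clip_prog_key layout.

```c
#include <stdbool.h>
#include <string.h>

#define NUM_SLOTS 4 /* illustrative; not the real number of VUE slots */

/* Key that stores a pointer: comparing keys compares addresses. */
struct key_with_pointer {
   const unsigned char *interp_mode;
};

/* Key that stores the data itself: comparing keys compares contents. */
struct key_with_array {
   unsigned char interp_mode[NUM_SLOTS];
};

/* Bytewise key comparison, as a program cache lookup might do. */
static bool
keys_equal(const void *a, const void *b, size_t size)
{
   return memcmp(a, b, size) == 0;
}

/* Build an array key by copying the interpolation modes into it. */
static struct key_with_array
make_array_key(const unsigned char *modes)
{
   struct key_with_array key;
   memset(&key, 0, sizeof(key));
   memcpy(key.interp_mode, modes, NUM_SLOTS);
   return key;
}
```

With the pointer form, two fragment shaders with identical interpolation
modes but separate prog_data produce different keys, so each one gets its
own program. With the array form the keys match and the program is shared.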

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
58a57ea7d6 i965/sf: make brw_sf_prog_key::interp_mode an array
Having it be a pointer means that we end up caching SF programs based
on a pointer to wm_prog_data rather than the actual interpolation modes.
We've been caching one SF program per FS ever since 91d61fbf7c
where Timothy rewrote brw_setup_vue_interpolation().

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
21ba2b4bef intel/compiler: Make brw_disasm take const assembly
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
c336c224a6 intel/decoder: Handle the BLT ring in gen_group_get_length
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
9d1001c8e5 intel/decoder: Handle gen4 VF_STATISTICS and PIPELINE_SELECT
These need special handling because they have no "DWord Length"
parameter and they have an unusual bias of 1.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
87588e546e intel/genxml: Rename 3DSTATE_AA_LINE_PARAMS on gen5
All of the other gens use "PARAMETERS".

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
04f6d975e1 intel/genxml: Use the right subtype for VF_STATISTICS on gen4
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
1fcc5e2399 intel/genxml: Iron Lake doesn't support non-normalized sampler coordinates
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
648b618dc5 intel/genxml: Add SAMPLER_STATE to gen 4.5
Somehow this got missed.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
3f8ee8c703 intel/genxml: Rename the CC_VIEWPORT pointer on gen4-5
It isn't a pointer to "color calc state", that's the packet it's in.
It's a pointer to the CC viewport state.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
0ee1ef0cbb intel/genxml: Sampler state is a pointer on gen4-5
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
64243d3b8e intel/genxml: Suffix KSP0 fields on Iron Lake
Iron Lake introduced the multiple KSP thing and so you have KSP0-3.
However, the genxml didn't have an index on the first "Kernel Start
Pointer" or "GRF Register Count".  Add one to match gen6+.  While we're
here, we drop the brackets from the other "GRF Register Count" fields.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
7769e448aa intel/genxml: Make a bunch of things offsets on gen4-5
Most things on gen4-5 are addresses because we don't have dynamic state
base address and we don't have instruction state base on gen4.  However,
whoever converted things to addresses got a little over-excited and
converted too much.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
8257fe7b18 intel/isl: Add gen4_filter_tiling
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
332a5d7a3f intel/isl: Add support for setting component write disables
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
8958355549 intel/isl: Add support for gen4 cube maps to get_image_offset_sa
Gen4 cube maps are a 2-D surface with ISL_DIM_LAYOUT_GEN4_3D which is a
bit weird but accurate nonetheless.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
b9b7792d9a intel/isl: Don't request space for stencil/hiz packets unless needed
On Iron Lake, the packets exist but we never emit them so there's no
need for us to ask the driver to make batch space for them.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
b50b821eb3 i965/blorp: Properly handle mt->first_level
The guts of blorp and ISL don't understand i965's partial miptrees.
Instead, we need to subtract off first_level before we hand anything off
to blorp.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
c16e840f9a i965/miptree: Take first_level into account when converting to ISL
ISL doesn't have a concept of a partial miptree.  Instead, we need to
subtract off first_level.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
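The first_level adjustment described in the two commits above can be illustrated with a minimal sketch in C. The struct and function names here are hypothetical and are not taken from i965 or ISL; the sketch only assumes that a partial miptree stores levels first_level..last_level and that the consumer expects zero-based levels.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a partial miptree: only levels
 * first_level..last_level of the full mip chain are stored. */
struct partial_miptree {
   uint32_t first_level;
   uint32_t last_level;
};

/* Number of levels a zero-based surface description would need. */
static uint32_t
zero_based_level_count(const struct partial_miptree *mt)
{
   assert(mt->last_level >= mt->first_level);
   return mt->last_level - mt->first_level + 1;
}

/* Translate a level of the full chain into a zero-based level by
 * subtracting off first_level. */
static uint32_t
to_zero_based_level(const struct partial_miptree *mt, uint32_t level)
{
   assert(level >= mt->first_level && level <= mt->last_level);
   return level - mt->first_level;
}
```

With first_level = 2 and last_level = 5, the zero-based description has four levels and level 2 of the full chain becomes level 0.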
Jason Ekstrand
554a1731a5 intel/blorp: Move the gen7 stencil format workaround to blorp_blit
It's not needed for blorp_copy because it already overrides formats.
It's also not needed for blorp_clear because it clears stencil as
stencil.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
c19150af5c i965: Use blorp_copy for doing r8 stencil updates on HSW
The blorp_copy entrypoint is designed for doing memcpy like operations
which is what we need to do here while blorp_blit is for handling format
conversion and scaling.  Using blorp_copy is much simpler and prevents
us from getting formats wrong.  While we're here, we get rid of the
layers_per_blit thing since stencil always uses interleaved MSAA.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-26 07:58:01 -07:00
Jason Ekstrand
441cd7a81d i965/blorp: Do an end-of-pipe sync on both sides of fast-clear ops
We've discovered in the Vulkan driver that simply doing the end-of-pipe
sync afterwards is insufficient.  The specific requirement stated in the
PRM is that you have to do one every time you transition between the
three modes of "clear", "render", and "resolve".  This is GL, so we could
track it but any attempt to do so would most likely get it wrong.  For
now, it's easier to just assume that every fast-clear op is an island
and do the sync both before and after.

This also removes the unneeded flush and stall after slow-clear
operations.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "17.0 17.1" <mesa-stable@lists.freedesktop.org>
2017-05-26 07:58:01 -07:00
Eric Engestrom
44b29dd7b6 amd/common: add missing libdrm include path
Fixes: de9dd4f9f1 ("ac/radeonsi: move struct radeon_info to ac_gpu_info.h")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-26 15:19:55 +01:00
Andres Gomez
cd8a9d7dfa docs: small release calendar fixes
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-05-26 15:08:14 +02:00
Dave Airlie
e1409f7302 Revert "amd/common: add vcn dec ip info query"
This reverts commit 524d4fff9e.

This commit breaks amdgpu on kernels with no DEC IP support.

Caught by the airlied CI system.
2017-05-26 16:36:57 +10:00
Dave Airlie
ae1f32915b Revert "amd/common: set vcn dec as hw decode as well"
This reverts commit 50d322be2f.

A previous patch breaks amdgpu on non-VCN decode systems,
but this commit has to be reverted first.
2017-05-26 16:36:38 +10:00
Rob Herring
1dc1860602 util: remove unneeded Android ifdef from ralloc.c
SIZE_MAX has been defined in stdint.h on Android since 2013, so this ifdef
is no longer needed.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-05-25 15:02:12 -05:00
Rob Herring
151bd66080 nouveau: drop Android 4.4 and earlier support
Support for Android 4.4 and earlier has already been removed from mesa.
Remove this remaining piece from nouveau, too.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-05-25 15:02:12 -05:00
Rob Herring
0dabb9d9fa i965: use mmap64 for Android
Simplify the handling of mmap for Android by using mmap64 instead. mmap64
may not have existed for Android when this was written, but it's been
around since 2013.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-05-25 15:01:28 -05:00
Rob Herring
51f9851753 gallium/os: use mmap64 for Android
Simplify the handling of mmap for Android by using mmap64 instead. mmap64
may not have existed for Android when this was written, but it's been
around since 2013.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-05-25 15:00:34 -05:00
Rob Herring
d5a9365d46 Android: generate an error if building on Android 4.4 or earlier
Since commit 7a5b5f5226 ("Android: drop Android 4.4 (KitKat) support"),
Android 4.4 or earlier is no longer supported, so exit with an error if we
try building on it.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-05-25 14:58:49 -05:00
Brian Paul
48faedbc7e st/wgl: whitespace, formatting fixes in stw_device.c
Trivial.
2017-05-25 11:13:40 -06:00
Brian Paul
12dc843367 glsl: Fix g++ initializer order warning
Fixes this warning:
In file included from ../../../src/compiler/glsl/ir.cpp:25:0:
../../../src/compiler/glsl/ir.h: In constructor 'ir_swizzle::ir_swizzle(ir_rvalue*, ir_swizzle_mask)':
../../../src/compiler/glsl/ir.h:1955:20: warning: 'ir_swizzle::mask' will be initialized after [-Wreorder]
    ir_swizzle_mask mask;
                    ^
../../../src/compiler/glsl/ir.h:1954:15: warning:   'ir_rvalue* ir_swizzle::val' [-Wreorder]
    ir_rvalue *val;
               ^
../../../src/compiler/glsl/ir.cpp:1592:1: warning:   when initialized here [-Wreorder]
 ir_swizzle::ir_swizzle(ir_rvalue *val, ir_swizzle_mask mask)
 ^

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-05-25 10:35:11 -06:00
Leo Liu
f94cfdc5f2 radeonsi: enable vcn decode
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-25 11:40:20 -04:00
Leo Liu
7ecc244b14 winsys/amdgpu: add vcn dec cs support
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-25 11:40:20 -04:00
Leo Liu
50d322be2f amd/common: set vcn dec as hw decode as well
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-25 11:40:20 -04:00
Leo Liu
524d4fff9e amd/common: add vcn dec ip info query
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-25 11:40:20 -04:00
Leo Liu
c23ffafc50 radeon: rename has_uvd info to has_hw_decode
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-25 11:40:20 -04:00
Leo Liu
34f7cf49c8 radeon/vcn: add decode message for mpeg4 codec
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-25 11:40:20 -04:00
Leo Liu
155ca0ca50 radeon/vcn: add decode message for mpeg2 codec
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-25 11:40:20 -04:00
Leo Liu
9c93c7c0b4 radeon/vcn: add decode message for vc1 codec
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-25 11:40:20 -04:00
Leo Liu
a55d2659d9 radeon/vcn: add decode message for hevc codec
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-25 11:40:20 -04:00
Leo Liu
9c21f6abda radeon/vcn: add decode message for avc codec
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-25 11:40:20 -04:00
Leo Liu
949dd66c9e radeon/vcn: add decode message feedback
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-25 11:40:20 -04:00
Leo Liu
0152a0cf16 radeon/vcn: add decode message destroy
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-25 11:40:20 -04:00
Leo Liu
a106866962 radeon/vcn: add decode message create
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-25 11:40:20 -04:00
Leo Liu
ae4faecf66 radeon/vcn: add common decode part
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-25 11:40:20 -04:00
Leo Liu
f9b8736776 radeon/winsys: add vcn dec ring type
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-25 11:40:20 -04:00
Leo Liu
2094b75c68 radeon/winsys: add uvd enc ring type
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-25 11:40:19 -04:00
Leo Liu
71075a8126 radeon/vcn: add vcn decode interface
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-25 11:40:19 -04:00
Leo Liu
e1f7936d05 configure.ac: update libdrm amdgpu version requirement to 2.4.81
VCN decode has a new interface, and that depends on the latest libdrm

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-25 11:40:19 -04:00
Emil Velikov
6a3ffda83a docs: update calendar, add news item and link release notes for 17.1.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-25 08:52:14 +01:00
Emil Velikov
1e735800a9 docs: add sha256 checksums for 17.1.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 092c485b8e)
2017-05-25 08:48:20 +01:00
Emil Velikov
e3ba46f6aa docs: add release notes for 17.1.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ca0a148a4d)
2017-05-25 08:48:19 +01:00
Timothy Arceri
c8a3bac820 mesa: remove unrequired double calc
type_size() will already handle this correctly for us.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-25 12:20:57 +10:00
Timothy Arceri
fd461b22e9 mesa: remove redundant modulus operation
The if check above means we can only get here if size is less than 4.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-25 12:20:49 +10:00
Brian Paul
4a6fdeab05 svga: init svga_screen::swc_mutex with mtx_recursive
If the SVGA3D_BindGBSurface() call in svga_buffer_hw_storage_unmap()
fails, we'll flush and that might involve unmapping other buffers.
That leads to a recursive lock on svga_screen::swc_mutex and causes
a deadlock.  Fix this by initializing the mutex with mtx_recursive.

Note that this only happened on Linux, not Windows.  On Windows, the
mutex functions are implemented with Win32 critical sections which
support recursive locking.

Also add a comment about this.

Fixes VMware bug 1831549 (Unigine Tropics demo freeze on Linux).

Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-05-24 11:33:47 -06:00
Brian Paul
0c84c395f8 svga: move logging initialization code into new function
Plus a few other minor clean-ups.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2017-05-24 11:33:47 -06:00
Brian Paul
84233ac661 svga: init local vars to silence uninitialized use warnings
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2017-05-24 11:33:47 -06:00
Brian Paul
cf1adb7b1c svga: log the process command line to the vmware.log file
This is useful for Piglit when thousands of tests are run and we want
to determine which test triggered a device error.

v2: only log command line info if the new SVGA_EXTRA_LOGGING env var is set

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-05-24 11:33:47 -06:00
Sinclair Yeh
14d1687229 svga: Limit svga message capability to newer compilers
The assembly code used by the SVGA message feature doesn't
build properly with older compilers, so limit it to only
gcc 5.3.0 and newer.

Also modified the stubs to avoid "unused variable" warnings.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-05-24 11:33:46 -06:00
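A compiler-version gate of the kind described in the commit above could look roughly like the following sketch. The macro and function names are hypothetical (they are not Mesa's), and the inline assembly itself is left out; only the predefined __GNUC__, __GNUC_MINOR__ and __GNUC_PATCHLEVEL__ macros are real.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Fold the GCC version into one comparable integer, e.g. 5.3.0 -> 50300.
 * GCC_VERSION_SKETCH is a hypothetical name, not the macro Mesa uses. */
#if defined(__GNUC__)
#define GCC_VERSION_SKETCH \
   (__GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__)
#else
#define GCC_VERSION_SKETCH 0
#endif

#if GCC_VERSION_SKETCH >= 50300
#define HAVE_HOST_LOG_SKETCH 1
#else
#define HAVE_HOST_LOG_SKETCH 0
#endif

/* Returns true when the message path is compiled in, false for the stub. */
static bool
host_log_sketch(const char *msg)
{
   assert(msg != NULL);
   (void) msg;   /* keeps the stub free of unused-variable warnings */
#if HAVE_HOST_LOG_SKETCH
   /* The real driver would issue its inline assembly here. */
   return true;
#else
   /* Older or non-GCC compilers get a no-op. */
   return false;
#endif
}
```

Compilers that only emulate an older GCC version number fall through to the stub, which matches the "only gcc 5.3.0 and newer" wording.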
Brian Paul
c85a35d465 svga: Fix MSVC build.
This lets us compile the code with MSVC, but it no-ops the log function.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-05-24 11:33:46 -06:00
Sinclair Yeh
1ce3a2723f svga: Add the ability to log messages to vmware.log on the host.
For now this capability only exists in the SVGA driver, but it
can be exported later if other modules, e.g. winsys, want
to use it for logging.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-05-24 11:33:46 -06:00
Brian Paul
3ad5325da0 Revert "gallium: remove unused PIPE_CC_GCC_VERSION"
This reverts commit e60928f4c4.

PIPE_CC_GCC_VERSION is used by some of our in-house code which hasn't
been upstreamed yet.
2017-05-24 11:33:46 -06:00
Lionel Landwerlin
359fa0e9a0 aubinator: report error on unknown device id
Since we're going to stop aubinator without a valid device id, better
report an error. This also silences a Coverity warning.

CID: 1405004
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-24 10:50:18 +01:00
Lionel Landwerlin
8f1f1d294d aubinator: be consistent on exit code
We're using both exit(1) & exit(EXIT_FAILURE), settle for one, same
for success.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-24 10:50:18 +01:00
Lionel Landwerlin
6200d835a0 aubinator: fix double free
Free previously allocated filename outside the for loop.

CID: 1405014
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-24 10:50:18 +01:00
Christian König
5318870f54 winsys/amdgpu: align VA allocations to fragment size v2
BOs larger than the minimum fragment size should have their VA
aligned to at least the fragment size for optimal performance.

v2: drop unused leftover from initial implementation

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-24 10:32:19 +02:00
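The alignment rule stated in the commit above can be sketched as follows. This is a minimal illustration, not the winsys code: the helper names are hypothetical, and the 64 KiB fragment size used in the usage note is only an example value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pick the VA alignment for a buffer of `size` bytes.
 * Both alignments are assumed to be non-zero powers of two. */
static uint64_t
va_alignment_sketch(uint64_t size, uint64_t base_alignment,
                    uint64_t min_fragment_size)
{
   assert(base_alignment != 0 &&
          (base_alignment & (base_alignment - 1)) == 0);
   assert(min_fragment_size != 0 &&
          (min_fragment_size & (min_fragment_size - 1)) == 0);

   /* Buffers larger than one fragment get at least fragment alignment. */
   if (size > min_fragment_size && min_fragment_size > base_alignment)
      return min_fragment_size;
   return base_alignment;
}

/* Round `value` up to the next multiple of a power-of-two alignment. */
static uint64_t
align_up_sketch(uint64_t value, uint64_t alignment)
{
   return (value + alignment - 1) & ~(alignment - 1);
}
```

For example, with a 4 KiB base alignment and a 64 KiB fragment size, a 1 MiB buffer would be placed on a 64 KiB boundary while a 4 KiB buffer keeps its 4 KiB alignment.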
Samuel Pitoiset
51dc5e3df3 tgsi: remove unused tgsi_is_passthrough_shader()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2017-05-24 09:52:17 +02:00
Eric Engestrom
338f47b6d8 configure.ac: rephrase 'GLX w/o X11' error message
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
2017-05-24 08:35:44 +01:00
Jason Ekstrand
39adea9330 anv: Require vertex buffers to come from a 32-bit heap
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 17:37:43 -07:00
Jason Ekstrand
50d0eb5096 anv: Advertise both 32-bit and 48-bit heaps when we have enough memory
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 17:37:42 -07:00
Jason Ekstrand
34581fdd4f anv: Refactor memory type setup
This makes us walk over the heaps one at a time and add the types for
LLC and !LLC to each heap.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:42 -07:00
Jason Ekstrand
b83b1af6f6 anv: Make supports_48bit_addresses a heap property
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:40 -07:00
Jason Ekstrand
00df1cd9d6 anv: Stop setting BO flags in bo_init_new
The idea behind doing this was to make it easier to set various flags.
However, we have enough custom flag settings floating around the driver
that this is more of a nuisance than a help.  This commit has the
following functional changes:

 1) The workaround_bo created in anv_CreateDevice loses both flags.
    This shouldn't matter because it's very small and entirely internal
    to the driver.

 2) The bo created in anv_CreateDmaBufImageINTEL loses the
    EXEC_OBJECT_ASYNC flag.  In retrospect, it never should have gotten
    EXEC_OBJECT_ASYNC in the first place.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:38 -07:00
Jason Ekstrand
10fad58b31 anv: Set image memory types based on the type count
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:36 -07:00
Jason Ekstrand
f7736ccf53 anv: Add valid_buffer_usage to the memory type metadata
Instead of returning valid types as just a number, we now walk the list
and check the buffer's usage against the usage flags we store in the new
anv_memory_type structure.  Currently, valid_buffer_usage == ~0.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:34 -07:00
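The walk over the memory types described in the commit above might look like this sketch. The struct is a trimmed-down, hypothetical stand-in for anv_memory_type, and the function name is invented for illustration; only the idea of comparing the buffer's usage against a per-type valid_buffer_usage mask comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the per-type metadata. */
struct memory_type_sketch {
   uint32_t valid_buffer_usage;   /* usage bits this type can serve */
};

/* Return a bitmask with bit i set when type i supports every usage bit
 * requested in `usage`. */
static uint32_t
memory_type_bits_for_usage(const struct memory_type_sketch *types,
                           uint32_t type_count, uint32_t usage)
{
   uint32_t bits = 0;

   assert(type_count <= 32);
   for (uint32_t i = 0; i < type_count; i++) {
      if ((usage & ~types[i].valid_buffer_usage) == 0)
         bits |= UINT32_C(1) << i;
   }
   return bits;
}
```

When every type has valid_buffer_usage == ~0, as the commit says is currently the case, the result is simply a mask with one bit per type.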
Jason Ekstrand
92325a7efc anv: Determine the type of mapping based on type metadata
Before, we were just comparing the type index to 0.  Now we actually
look the type up in the table and check its properties to determine what
kind of mapping we want to do.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:32 -07:00
Jason Ekstrand
c1f4343807 anv: Set up memory types and heaps during physical device init
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:30 -07:00
Jason Ekstrand
eceaf7e234 anv: Predicate 48bit support on gen >= 8
This doesn't matter right now since it only affects whether or not we
set the kernel bit but, if we ever do anything else based on it, we'll
want it to be correct per-gen.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:27 -07:00
Jason Ekstrand
4eecd534f0 anv/image: Get rid of the memset(aux, 0, sizeof(aux)) hack
Up until now, we've been memsetting the auxiliary surface to 0 at
BindImageMemory time to ensure that it is properly initialized.
However, this isn't correct because apps are allowed to freely alias
memory between different images and buffers so long as they properly
track whether or not a particular image is valid and, if it isn't,
transition from UNINITIALIZED to something else before using it.  We
now implement those transitions so we can drop the hack.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:22 -07:00
Jason Ekstrand
cc45c4bb80 anv: Handle transitioning depth from UNDEFINED to other layouts
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:20 -07:00
Jason Ekstrand
75edecf502 anv: Handle color layout transitions from the UNINITIALIZED layout
This causes dEQP-VK.api.copy_and_blit.resolve_image.partial.* to start
failing due to test bugs.  See CL 1031 for a test fix.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-23 16:46:03 -07:00
Axel Davy
7e04ae74d4 st/nine: Fix a regression and syntax cleanup
A few cleanups and in particular initializing properly
the new pipe_draw_info fields.
This should fix the regression caused by
330d0607ed

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101088

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-24 00:40:43 +02:00
Ian Romanick
7009955281 mesa: Remove GL_APPLE_vertex_array_object stubs
Mark the functions 'exec="skip"' in the XML instead.  libGL will still
have the functions, but the driver won't try to use them.  I verified
that this commit works with piglit's 'object-namespace-pollution glClear
vertex-array' on x64 with a driver built from mesa-12.0.3 tag.

In fairness, this test also works with a libGL built from 7927d03.  I
believe it continues to work because on non-Windows platforms we
generate some extra, dummy dispatch functions that can be used when a
driver requests a function unknown to libGL.  This was done to provide
some "forward" compatibility with drivers that need more functions.
This doesn't work on Windows because the Windows calling convention is
for the callee to clean up the stack.  That's the theory anyway.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-23 15:02:29 -07:00
Marek Olšák
0781b58b3a gallium/radeon: pipe AMDGPU_INFO_NUM_VRAM_CPU_PAGE_FAULTS into gallium HUD
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-23 23:29:16 +02:00
Rob Clark
1db28fbbea freedreno/ir3: switch to NIR by default
Now that we lower vars to regs, we no longer regress for anything that
does complex dereferences.  (With tgsi, derefs are already lowered
before tgsi_to_nir, but not with glsl_to_nir.)  In fact it actually
fixes a few things to bypass tgsi.

So make NIR the default (finally!)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-23 12:26:35 -04:00
Rob Clark
caa64b24ce freedreno/ir3: lower arrays to regs
Instead of using load/store_var intrinsics, which can have complex
derefs in the case of multi-dimensional arrays, lower these to regs
and handle the direct/indirect loads in get_src() and stores in
put_dst().

This should let us switch to using nir by default.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-23 12:26:35 -04:00
Rob Clark
232fc99544 freedreno/ir3: add put_dst()
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-23 12:26:35 -04:00
Rob Clark
2bbd425adb freedreno/ir3: code-motion
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-23 12:26:35 -04:00
Rob Clark
90dade300f freedreno/ir3: fix cmdline compiler
standalone_compiler_cleanup() frees the glsl types, among other things,
so it needs to come after nir->ir3.  But since we exit after dumping the
disassembly, it is easier to just not call it at all.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-23 12:26:35 -04:00
Rob Clark
1059dc9165 freedreno/ir3: add missing nir_opt_copy_prop_vars() pass
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-23 12:26:35 -04:00
Rob Clark
c712a637b9 freedreno/ir3: need different compiler options for a5xx
vertex_id_zero_based differs..

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-23 12:26:35 -04:00
Rob Clark
4531e67c47 freedreno/a5xx: remove copypasta from a4xx
Won't ever hit this w/ a420 gpu, so this is dead code.  Need to get astc
working to know whether to rip this out entirely or not.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-23 12:26:35 -04:00
Rob Clark
0c2e0f15b8 freedreno: only support SSBOs with nir
tgsi_to_nir does not support them.  Note that compute shaders already
force nir.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-23 12:26:35 -04:00
Rob Clark
444b4b40f9 freedreno/a5xx: add some missing texture formats
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-23 12:26:35 -04:00
Rob Clark
6ccbbd8d05 freedreno/a5xx: provoking vertex
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-23 12:26:35 -04:00
Rob Clark
d7f296de26 freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-23 12:26:35 -04:00
Rob Clark
6f65a1a211 nir/lower-atomics-to-ssbo: remove atomic_uint arrays too
Maybe there is a better way to do this.  But by the time we get to
assigning uniform locs, we want the atomic_uint's to all be gone,
otherwise we assert in st_glsl_attrib_type_size().

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-23 12:26:34 -04:00
Rob Clark
5f6c034f82 nir/lower-atomics-to-ssbo: fix num_components
Fixes some piglits like arb_shader_atomic_counters-active-counters

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-23 12:26:34 -04:00
Timothy Arceri
a363fa0c99 radeon: pass flags that can change shaders to disk_cache_create()
I wasn't sure if I should filter the flags so that we only use
flags that actually change the shader output. To avoid manual
updates we just pass in everything for now.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-23 09:09:43 +10:00
Timothy Arceri
0bbcfbfc0b util/disk_cache: add new driver_flags param to cache keys
This will be used for things such as adding driver specific environment
variables to the key. Allowing us to set environment vars that change
the shader and not have the driver ignore them if it finds existing
shaders in the cache.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-05-23 09:09:43 +10:00
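To illustrate why folding driver flags into the cache key matters, here is a minimal sketch. It deliberately swaps in 64-bit FNV-1a for brevity; Mesa's disk cache derives its keys differently, and the function names and the layout of the hashed data are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Continue a 64-bit FNV-1a hash over a byte range. */
static uint64_t
fnv1a64(uint64_t hash, const void *data, size_t size)
{
   const unsigned char *bytes = data;

   for (size_t i = 0; i < size; i++) {
      hash ^= bytes[i];
      hash *= UINT64_C(0x100000001b3);   /* FNV-1a 64-bit prime */
   }
   return hash;
}

/* Hypothetical key: driver id, then driver flags, then the shader blob. */
static uint64_t
cache_key_sketch(const char *driver_id, uint64_t driver_flags,
                 const void *shader, size_t shader_size)
{
   unsigned char flag_bytes[8];
   uint64_t hash = UINT64_C(0xcbf29ce484222325);   /* FNV offset basis */

   assert(driver_id != NULL && shader != NULL);

   /* Serialize the flags byte by byte so the key is endian-independent. */
   for (int i = 0; i < 8; i++)
      flag_bytes[i] = (unsigned char) ((driver_flags >> (8 * i)) & 0xff);

   hash = fnv1a64(hash, driver_id, strlen(driver_id));
   hash = fnv1a64(hash, flag_bytes, sizeof(flag_bytes));
   hash = fnv1a64(hash, shader, shader_size);
   return hash;
}
```

The same shader source hashed with different flag values yields a different key, so a cached binary built under one set of flags is not picked up under another.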
Jose Fonseca
d970f773f4 u_format_test: Ignore S3TC errors.
This prevents spurious failures when libtxc-dxtn-s2tc is installed.

Note: lp_test_format doesn't need any change since we were already
ignoring S3TC failures there.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
2017-05-22 21:00:06 +01:00
Nanley Chery
d132bb36ce docs: Document ASTC extension support for SKL and BXT
v2: Remove the '+' after bxt

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2017-05-22 11:13:53 -07:00
Nanley Chery
d6150bd764 i965: Enable ASTC HDR for Broxton
This platform passes the following GLES3 tests:
ES3-CTS.functional.texture.compressed.astc.endpoint_value_hdr_cem_*

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2017-05-22 11:13:53 -07:00
Nanley Chery
52a6fd9871 intel/isl: Add ASTC HDR to format lists and helpers
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2017-05-22 11:13:53 -07:00
Bas Nieuwenhuizen
b2c5e69942 radv: Add compute HTILE fast clear.
Not really what the fast depth clear does, no matter whether you use
EXPCLEAR or not. Seems the fast clear using the DB HW always touches
the main buffer.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-05-22 20:07:21 +02:00
Bas Nieuwenhuizen
df91abfe5a radv: Use correct clear words for HTILE.
Did some RE'ing what several HTILE words give when read from a descriptor
with HTILE compression enabled.

Seems to align with -pro usage for D16 too.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-05-22 20:07:21 +02:00
Bas Nieuwenhuizen
0b26f0ee4f radv: Add queue masks for htile usage determination.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-05-22 20:07:21 +02:00
Bas Nieuwenhuizen
0628580eff radv: Specify semantics of HTILE layout helpers.
And correct implementation to specify only what we support.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-05-22 20:07:21 +02:00
Bas Nieuwenhuizen
62e182acd0 radv: Don't use a separate can_expclear.
We never use EXPCLEAR clears.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-05-22 20:07:21 +02:00
Ian Romanick
7174e3f22b mesa: GL_ARB_shader_subroutine is not optional in core profile
text	   data	    bss	    dec	    hex	filename
7038459	 235248	  37280	7310987	 6f8e8b	32-bit i965_dri.so before
7038227	 235248	  37280	7310755	 6f8da3	32-bit i965_dri.so after
6681438	 303400	  50608	7035446	 6b5a36	64-bit i965_dri.so before
6681254	 303400	  50608	7035262	 6b597e	64-bit i965_dri.so after

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-05-22 10:51:26 -07:00
Benedikt Schemmer
b026f45bdd drirc: Add allow_glsl_builtin_variable_redeclaration for Dead Island Riptide Definitive Edition
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-22 19:32:07 +02:00
Marek Olšák
8c069a6a06 gallium/radeon: add a query for monitoring Gallium thread load
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-22 19:23:39 +02:00
Marek Olšák
2beb31bd7c radeonsi/gfx9: compile shaders with +xnack
so that LLVM doesn't allocate SGPRs where XNACK is.

Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-22 19:23:39 +02:00
Rhys Kidd
499f45163a vc4: Remove dead code in vc4_dump_surface_msaa()
Coverity caught the use of dead code copy-paste for
found_colors[] and num_found_colors.

CID: 1341850
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-22 09:50:22 -07:00
Lionel Landwerlin
30dc56bb5b egl/wayland: verify event queue was allocated
We've already verified that 'window' wasn't NULL, I'm guessing this
allocation error is about the newly created queue.

CID: 1409754
Fixes: 03dd9a88b0 ("egl/wayland: Use per-surface event queues")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-05-22 15:44:38 +01:00
Timothy Arceri
4eb0411ed7 mesa: add APPLE_vertex_array_object stubs
APPLE_vertex_array_object support was removed in 7927d0378f.
However it turns out we can't remove the functions because this
can cause issues when libglapi is used together with DRI
drivers built prior to said commit.

Fixes: 7927d0378f ("mesa: drop APPLE_vertex_array_object support")

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-22 14:56:51 +10:00
Timothy Arceri
3ceae88642 glsl: set mask via initialisation list rather than in constructor body
Potentially more efficient as it may avoid the struct being initialised
twice.

Also add var to the initialisation list while we are here.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-22 14:21:55 +10:00
Vladislav Egorov
cf164d9e97 ralloc: Use strnlen() inside of strncat()
If the str is long or isn't null-terminated, strlen() could take a lot
of time or even crash. I don't know why it was used in the first place,
maybe for platforms without strnlen(), but strnlen() is already used
inside of ralloc_strndup(), so this change should not additionally
break anything.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-22 12:34:28 +10:00
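The difference between strlen() and strnlen() in a bounded concatenation can be shown with a small sketch. This is not ralloc's implementation: it uses plain realloc(), and the function name and return convention are assumptions for the example. strnlen() is a POSIX function.

```c
#define _POSIX_C_SOURCE 200809L   /* for strnlen() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Append at most n bytes of str to the heap string *dest, growing it with
 * realloc().  Returns 1 on success, 0 on allocation failure (in which case
 * *dest is left untouched). */
static int
bounded_strcat(char **dest, const char *str, size_t n)
{
   assert(dest != NULL && *dest != NULL && str != NULL);

   size_t existing = strlen(*dest);
   size_t extra = strnlen(str, n);   /* never reads past str[n - 1] */

   char *grown = realloc(*dest, existing + extra + 1);
   if (grown == NULL)
      return 0;

   memcpy(grown + existing, str, extra);
   grown[existing + extra] = '\0';
   *dest = grown;
   return 1;
}
```

Because the scan stops after n bytes, the source may be very long or not null-terminated at all, as long as at least n bytes of it are readable.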
Vladislav Egorov
4a47247523 glcpp: Skip unnecessary line continuations removal
Overwhelming majority of shaders don't use line continuations. In my
shader-db only shaders from the Talos Principle and Serious Sam used
them, less than 1% out of all shaders. Optimize for this case, don't
do any copying if no line continuation was found.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-22 12:34:28 +10:00
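The fast path described in the commit above can be sketched like this. It is a simplified illustration, not glcpp's code: the function name is hypothetical, only backslash followed by "\n" is treated as a continuation, and any line-number bookkeeping is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Remove backslash-newline pairs from `src`.
 * Returns `src` itself when there is nothing to remove (no copy is made),
 * a newly malloc()ed string otherwise, or NULL on allocation failure. */
static const char *
remove_line_continuations_sketch(const char *src)
{
   assert(src != NULL);

   const char *next = strstr(src, "\\\n");
   if (next == NULL)
      return src;                /* fast path: the common case */

   char *out = malloc(strlen(src) + 1);
   if (out == NULL)
      return NULL;

   char *dst = out;
   while (next != NULL) {
      size_t chunk = (size_t) (next - src);
      memcpy(dst, src, chunk);
      dst += chunk;
      src = next + 2;            /* skip the backslash and the newline */
      next = strstr(src, "\\\n");
   }
   strcpy(dst, src);             /* copy the tail and the terminator */
   return out;
}
```

A caller can tell whether a copy was made by comparing the returned pointer with the input, and only needs to free the result in that case.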
Vladislav Egorov
b8e792ee25 glcpp: Avoid unnecessary strcmp()
strcmp() is slow. Initiate comparison with "__LINE__" or "__FILE__"
only if the identifier starts with '_', which is rare.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-22 12:34:28 +10:00
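The first-character check from the commit above amounts to something like the following sketch. The function name is hypothetical; the only assumption taken from the commit message is that the comparison against "__LINE__" and "__FILE__" is skipped unless the identifier starts with '_'.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

static bool
is_line_or_file_macro(const char *identifier)
{
   assert(identifier != NULL);

   /* Cheap rejection: identifiers rarely start with '_'. */
   if (identifier[0] != '_')
      return false;

   return strcmp(identifier, "__LINE__") == 0 ||
          strcmp(identifier, "__FILE__") == 0;
}
```

For a typical identifier such as gl_Position, the function returns after a single character comparison.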
Thomas Helland
1575a8146a main: Move hashLockMutex/hashUnlockMutex to header and inline
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-22 09:19:24 +10:00
Thomas Helland
f203a9f7d1 main: Use _mesa_HashLock/UnlockMutex consistently
This is shorter and easier on the eyes. At the same time this
also ensures that we are always asserting that the table pointer
is not NULL. Currently that was not done for all situations.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-22 09:17:37 +10:00
Thomas Helland
90dfcc6b32 util: Change the pointer hashing function
Use our knowledge that pointers are at least 4 byte aligned to remove
the useless digits. Then shift by 6, 10, and 14 bits and add this to
the original pointer, effectively folding in the entropy of the higher
bits of the pointer into a 4-bit section. Stopping at 14 means we can
add the entropy from 18 bits, or at least a 600Kbyte section of memory.
Assuming that ralloc allocates from a linearly allocated heap less than
this we can make a very efficient pointer hashing function for our usecase.
Even if we are not on an architecture that is 4 byte aligned, there is
still a high chance that the thing we are allocating is at least
8 bytes in size, so even then we will have entropy into the third bit.

The 4 bit increment on the shifts is chosen rather arbitrarily; if we
had chosen a 3 bit increment we would need to add another xor to
cover a decently sized memorypool. Increasing it to 5 bits would
spread our entropy more, possibly hurting us with more collisions on
hash tables of size less than 32. With a hash table of size 16 there
are a max of 11 entries, and we can assume that with such a small table
collisions are not that painful.

This allows us to hash the whole 32 or 64 bit pointer at once,
instead of running FNV1a, looping through each byte and doing
increments, decrements, muls, and xors on every byte. This cuts
_mesa_hash_data from 1.5 % on profiles, to making _mesa_hash_pointer
show up with a 0.09% share. Collisions on insertion actually seem to be
ever so slightly lower with this hash function, as found by printing
a loop counter and sorting the data.
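
For comparison, this is roughly what a byte-wise 32-bit FNV-1a looks like. It is a generic sketch using the standard FNV-1a constants, not a copy of _mesa_hash_data; the function name is made up for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Generic 32-bit FNV-1a: one xor and one multiply per input byte. */
static uint32_t
fnv1a_32(const void *data, size_t size)
{
   const uint8_t *bytes = data;
   uint32_t hash = 2166136261u;   /* FNV-1a 32-bit offset basis */

   for (size_t i = 0; i < size; i++) {
      hash ^= bytes[i];
      hash *= 16777619u;          /* FNV 32-bit prime */
   }
   return hash;
}
```

Hashing an 8-byte pointer this way runs the loop body eight times, which is the per-byte work the shift-based hash avoids.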

perf stat shows a 1.5% reduction in instruction count,
and a 5% reduction in stalled cycles. Shader-db runtime goes
from 225 to 220 seconds.

No instruction-count changes in shader-db, but there are some minor
changes in cycle-count that are likely caused by nir walking a set
in some of its passes, and this causing a different ordering.
That might eventually lead to a difference in register allocation.
However, the effect is a net positive:

total cycles in shared programs: 24739550 -> 24738482 (-0.00%)
cycles in affected programs: 374468 -> 373400 (-0.29%)
helped: 178
HURT: 49

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-22 09:17:37 +10:00
Philipp Zabel
1586768e74 vulkan/wsi/wayland: Fix proxy wrappers for swapchain recreation
Before the swapchain event queue is destroyed, all proxy objects that reference
it must be dropped. Otherwise we risk a use-after-free if a frame callback event
or buffer release events are received afterwards.
This happens when an application destroys and recreates a swapchain in FIFO
mode between two frames without using the VkSwapchainCreateInfoKHR::oldSwapchain
mechanism to keep the old swapchain until after the next redraw.

Fixes: 5034c61558 ("vulkan/wsi/wayland: Use proxy wrappers for swapchain")
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
2017-05-20 17:00:08 +01:00
John Brooks
2b878cb8fd drirc: Add allow_glsl_builtin_variable_redeclaration for Dying Light and Dead Island Definitive Edition
This fixes the long-standing problem with Dying Light where the game would
produce a black screen when running under Mesa. This happened because the
game's vertex shaders redeclare gl_VertexID, which is a GLSL builtin.
Mesa's GLSL compiler is a little more strict than others, and would not
compile them:

    error: `gl_VertexID' redeclared

The allow_glsl_builtin_variable_redeclaration directive allows the shaders
to compile and the game to render. The game also requires OpenGL 4.4+ (GLSL
440), but does not request it explicitly. It must be forced with an
override, such as MESA_GL_VERSION_OVERRIDE=4.5 and
MESA_GLSL_VERSION_OVERRIDE=450. A compatibility context is *not* required
and forcing one with 4.5COMPAT or allow_higher_compat_version results in
graphical artifacts.

Dead Island Definitive Edition is another Techland port on the same engine
with the same problems, so we set the
allow_glsl_builtin_variable_redeclaration option for that game as well.

v2 (Samuel Pitoiset):
    - Rename allow_glsl_builtin_redeclaration ->
      allow_glsl_builtin_variable_redeclaration

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96449
Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-20 17:30:07 +02:00
John Brooks
6e8f34a2de glsl: Conditionally allow redeclaration of built-in variables
Conditional on allow_glsl_builtin_variable_redeclaration driconf option.

v2 (Samuel Pitoiset):
    - Rename allow_glsl_builtin_redeclaration ->
      allow_glsl_builtin_variable_redeclaration
    - style: put spaces after 'if'

Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-20 17:30:05 +02:00
John Brooks
bf4d7671f4 driconf: Add allow_glsl_builtin_variable_redeclaration option
This option will allow GLSL builtins to be redeclared verbatim (e.g.
redeclaring "in int gl_VertexID" in a vertex shader). This is not strictly
valid and would normally fail to compile, but some applications (such as
newer Techland ports) do it and need more leniency.

v2 (Samuel Pitoiset):
    - Rename allow_glsl_builtin_redeclaration ->
      allow_glsl_builtin_variable_redeclaration

Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-20 17:29:55 +02:00
Ilia Mirkin
61d8f3387d nv50,nvc0: clear index buffer bufctx bin unconditionally
The previous condition was to clear it out if it had previously been
set, not what's in the current draw. That information is gone now, so
just clear it unconditionally.

Fixes: 330d0607e ("gallium: remove pipe_index_buffer and set_index_buffer")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-05-20 04:20:11 -04:00
Ilia Mirkin
85d2186326 nv50: fix vtxbuf cleanup
Use a user-buffer-aware cleanup function.

Fixes: c24c3b94ed ("gallium: decrease the size of pipe_vertex_buffer - 24 -> 16 bytes")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-05-20 04:20:11 -04:00
Kenneth Graunke
e781b9e640 i965: Use the upload BO for push constants on Gen7.5-Gen8.
We can easily use the upload BO for push constants on Gen7.5/Gen8 too,
at the cost of a relocation when emitting 3DSTATE_CONSTANT_XS.  We can
simply switch to using constant buffer pointer 2 instead of pointer 0,
like we do on Gen9+.

Ivybridge and Baytrail can't do this trick because they require the
constant buffers to be enabled in order, starting with 0.  We'd have
to set the INSTPM bit to make the constant buffer pointer not relative
to dynamic state base address, which would need kernel command parser
support.

Improves performance in GLBenchmark 2.7/TRex Offscreen by:
- Broadwell GT2: 0.305608% +/- 0.19877% (n = 68)
- Braswell: No difference proven (n = 742)
- Haswell GT3e: 0.180755% +/- 0.0237505% (n = 30)

Reviewed-by: Chris Forbes <chrisforbes@google.com>
2017-05-20 00:23:10 -07:00
Kenneth Graunke
494593e6b2 i965: Use the upload BO for push constants on Gen9+.
Shaders can use quite a bit of uniform data.  Better to put it in the
upload buffers, like we do for client vertex data, rather than the
batch buffer state area, which is primarily used for indirect state.

This should free up batch space, allowing us to emit more commands in a
batch before flushing.  Because BRW_NEW_BATCH also causes a lot of state
to be re-emitted, it may also reduce CPU overhead a little bit.

We took this approach on Gen4-5, but switched to using the batch area
on Gen6+ because buffer 0 is relative to Dynamic State Base Address by
default, which is set to the start of the batch.

On Gen9+, we already use a relocation due to a workaround, so this is
trivial to change and has basically no downside.

Unfortunately we can't change compute shader push constants because
MEDIA_CURBE_LOAD always uses an offset from dynamic state base address.

Improves performance in GLBenchmark 2.7/TRex Offscreen by:
- Skylake GT4e: 0.52821% +/- 0.113402% (n = 190)
- Apollolake: 0.510225% +/- 0.273064% (n = 70)

Reviewed-by: Chris Forbes <chrisforbes@google.com>
2017-05-20 00:23:10 -07:00
Kenneth Graunke
731b577cc6 i965: Drop BRW_NEW_PUSH_CONSTANT_ALLOCATION from CS packets.
I don't think CS push constant uploading uses the section of L3
controlled by 3DSTATE_PUSH_CONSTANT_ALLOC_XS.  So I don't think
it needs to be re-emitted when that space is reallocated.

The programming note in gen7_allocate_push_constants doesn't
indicate this is necessary, at least.

Reviewed-by: Chris Forbes <chrisforbes@google.com>
2017-05-20 00:23:10 -07:00
Ilia Mirkin
82e77d4e44 nvc0/ir: SHLADD's middle source must be an immediate
The instruction encodings only allow for immediates. Don't try to
replace a zero (which is dumb to have in that op in any case) with RZ.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-05-20 03:12:40 -04:00
Tapani Pälli
f0051fcf2b android: add -Wl,--build-id=sha1 to LDFLAGS for libvulkan_intel
Just as is done on desktop, and as expected by the build-id code.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-20 08:59:57 +03:00
Emil Velikov
48cd1919ff configure.ac: s/xcb-fixes/xcb-xfixes/
Former is not a thing, even if I have a hacked xcb-fixes.pc on my system.
Thanks for spotting it Mark!

Fixes: 9a90d6a9d4 ("configure.ac: add xcb-fixes to the XCB DRI3 list")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-20 01:15:02 +01:00
Emil Velikov
9a90d6a9d4 configure.ac: add xcb-fixes to the XCB DRI3 list
The XCB module is used by the VL targets. Thus omitting it can lead to
link-time errors due to unresolved symbols.

Other DRI3 users such as the Vulkan WSI and the dri3 loader helper do
not use an update region in their xcb_present_pixmap() call. We will
look into that at a later stage.

Fixes: acf3d2afab ("configure: check once for DRI3 dependencies")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101110
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-20 00:12:56 +01:00
Emil Velikov
5233eaf9ee automake: add SWR LLVM gen_builder.hpp workaround
As gen_builder.hpp file is generated, it contains information that is
specific to the LLVM version it originates from.

As suggested by Tim, the file seems to be forwards compatible. So in
order to ship a file which will work everywhere we should be
using the earliest supported LLVM, 3.9.

With this we're back on track and can build all of mesa without
python/mako/flex and friends.

In the long term we might want to see if the python generators can be
updated to produce LLVM version agnostic files. At least within the
range supported by SWR.

Cc: <mesa-stable@lists.freedesktop.org>
Cc: Chuck Atkins <chuck.atkins@kitware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-05-20 00:12:56 +01:00
Timothy Arceri
80e643345e st/mesa: don't mark the program as in cache_fallback when there is cache miss
When we fall back, the gl_program objects are currently re-allocated.

This is likely to change when the i965 cache lands, but for now
this fixes a crash when using MESA_GLSL=cache_fb. This env var
simulates the fallback path taken when a tgsi cache item doesn't
exist due to being evicted previously or some kind of error.

Unlike i965 we are always falling back at link time so it's safe to
just re-allocate everything. We will be unnecessarily freeing and
re-allocating a bunch of things here but it's probably not a huge deal,
and can be changed when the i965 code lands.

Fixes: 0e9991f957 ("glsl: don't reference shader prog data during cache fallback")

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-20 08:35:51 +10:00
Timothy Arceri
a74300c7ff mesa: add an env var to force cache fallback
For the gallium state tracker a tgsi binary may have been evicted
from the cache to make space. In this case we would take the
fallback path and recompile/link the shader.

On i965 there are a number of reasons we can get to the program
upload stage and have neither IR nor a valid cached binary.
For example the binary may have been evicted from the cache or
we need a variant that wasn't previously cached.

This environment variable enables us to force the fallback path that
would be taken in these cases and makes it easier to debug these
otherwise hard to reproduce scenarios.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-20 08:35:51 +10:00
Timothy Arceri
8cad301a3e st/mesa: improve shader cache debug info
This will explicitly state that we are following the fallback
path when we find invalid/corrupt cache items. It will also
output the fallback message when the fallback path is forced
via an environment variable; the following patches will allow
this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-20 08:35:51 +10:00
Emil Velikov
552cd5cce5 travis: remove workarounds for the Vulkan target
Previously we required --enable-egl for the platform selection to work.
Additionally due to the broken DRI3 dependency tracking we needed
--enable-glx.

Since both of these are now sorted, we no longer need the
workarounds.

While we're here, explicitly enable dri3.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-19 19:46:55 +01:00
Emil Velikov
5ab6ded0a9 configure: trivial whitespace cleanup
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-19 19:46:55 +01:00
Emil Velikov
b496fc2932 configure: error out if building XVMC w/o supported platform
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-19 19:46:55 +01:00
Emil Velikov
037e9d37b4 configure: error out if building VDPAU w/o supported platform
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-19 19:46:54 +01:00
Emil Velikov
1914c814a6 configure: error out if building OMX w/o supported platform
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-19 19:46:54 +01:00
Emil Velikov
63e11ac2b5 configure: error out if building VA w/o supported platform
A somewhat pedantic patch to foolproof the build, should someone start
tinkering without knowing what they are doing.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-19 19:46:54 +01:00
Emil Velikov
912f24fd32 st/xvmc: add DRI3 support
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-05-19 19:46:54 +01:00
Emil Velikov
fdc90e1286 st/omx: add DRI3 support
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>
2017-05-19 19:46:54 +01:00
Emil Velikov
fcbedce310 gallium/targets: link against XCB only as needed
OMX and VA can optionally use the X11 DRI2/DRI3, thus we should link
only as required.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-19 19:46:54 +01:00
Emil Velikov
115cb729d8 st/omx: fix building against X11-less setups
The vl_*_screen_create API properly falls back to a NOP when we're
building without specific platforms. So the only thing we need is to
handle the lack of X11/Xlib.h and provide a dummy Display define.

Cc: <mesa-stable@lists.freedesktop.org>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-19 19:46:49 +01:00
Emil Velikov
d71ce62e84 st/omx: remove unneeded X11 include
En route to X11-less builds

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-19 19:46:48 +01:00
Emil Velikov
8b9868ad4c st/omx: remove unused drm_driver.h includes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-19 19:46:47 +01:00
Emil Velikov
28703d605d st/va: check if vl_*_screen_create has failed only once
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-19 19:46:46 +01:00
Emil Velikov
aaea53c2c0 st/va: fix misplaced closing bracket
It's been like this since the code was introduced.

Fixes: 86eb4131a9 (st/va: add headless support, i.e. VA_DISPLAY_DRM)
Cc: <mesa-stable@lists.freedesktop.org>
Cc: Julien Isorce <julien.isorce@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-19 19:46:46 +01:00
Emil Velikov
c34a008891 st/va: move variable declaration to where it's used
... and make it const, since we shouldn't tinker with it.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-19 19:46:46 +01:00
Emil Velikov
369e5dd939 auxiliary/vl: use vl_*_screen_create stubs when building w/o platform
Provide a dummy stub when the user has opted w/o said platform, thus
we can build the binaries without unnecessarily requiring X11/other
headers.
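
A rough sketch of the stub pattern, with hypothetical names (example_screen, example_x11_screen_create) standing in for the real vl_*_screen_create entry points. It assumes the build defines HAVE_PLATFORM_X11 only when the X11 platform is enabled:

```c
#include <stddef.h>

struct example_screen;   /* hypothetical opaque screen type */

#ifdef HAVE_PLATFORM_X11
/* Real implementation lives elsewhere and needs the X11 headers. */
struct example_screen *example_x11_screen_create(void *display, int screen);
#else
/* Platform disabled: provide a no-op stub so callers still compile and link. */
static inline struct example_screen *
example_x11_screen_create(void *display, int screen)
{
   (void) display;
   (void) screen;
   return NULL;
}
#endif
```

Callers then treat a NULL return as "screen creation failed" regardless of whether the platform was compiled in.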

In order to avoid build and link-time issues, we remove the HAVE_DRI3
guards in the VA and VDPAU state-trackers.

With this change st/va will return VA_STATUS_ERROR_ALLOCATION_FAILED
instead of VA_STATUS_ERROR_UNIMPLEMENTED. That is fine since upstream
users of libva such as vlc and mpv do little error checking, let
alone distinguish between the two.

Cc: Leo Liu <leo.liu@amd.com>
Cc: Guttula, Suresh <Suresh.Guttula@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-19 19:46:41 +01:00
Emil Velikov
05043e0e8e configure: error out when building X11 Vulkan without DRI3
Vulkan supports only DRI3 enabled X11 platforms. Make it obvious,
should one consider building without it.

Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-05-19 19:44:25 +01:00
Emil Velikov
d80d6d662e loader: build libloader_dri3_helper.la only with HAVE_PLATFORM_X11
Pretty much every other place does the same.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-05-19 19:44:22 +01:00
Emil Velikov
a24dc36dde vulkan: automake: remove unused VULKAN_LIB_DEPS variable
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-05-19 19:44:17 +01:00
Emil Velikov
acf3d2afab configure: check once for DRI3 dependencies
Currently the XCB_DRI3 dependencies are partially
duplicated.

Just do a once-off check and add all of the respective CFLAGS/LIBS
where needed.

As a nice side effect this helps us solve a couple of FIXMEs.

DRI3 is not a thing w/o X11 so disable it in such cases.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-05-19 19:44:15 +01:00
Emil Velikov
8212fc95b5 configure: error out when building GLX w/o the X11 platform
Building EGL/Vulkan/other without X11, while GLX is enabled is confusing
and misleading. In practice anyone aiming at the former will also
disable GLX.

The inverse (some examples below) should still work:
 ./configure --disable-glx --with-platforms=x11 --with-vulkan-drivers=intel
 ./configure --disable-glx --with-platforms=x11 --enable-egl

Keep in mind that the X11 platform is enabled, by default.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-05-19 19:44:12 +01:00
Emil Velikov
f353f844a0 configure: set HAVE_foo_PLATFORM as applicable
Rather than having multiple places that define the macros, do it just
once in configure. Makes existing code a bit shorter and easier to
manage as we fix the VL targets with follow-up commits.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-05-19 19:44:09 +01:00
Emil Velikov
2d35773221 configure: enable the surfaceless platform by default
A simple platform that you want to use in many use cases. See the
spec file details.

It has no special requirements plus it takes less than a second to
build.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-05-19 19:44:06 +01:00
Emil Velikov
edb5a65f93 configure: loosen --with-platforms heuristics
Remove the enable-egl pre-requirement. Platform selection does not
depend on EGL.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-05-19 19:44:04 +01:00
Emil Velikov
73682f82bc configure: update remaining --with-egl-platforms references
Rename the remaining references to omit the egl part.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-05-19 19:44:02 +01:00
Emil Velikov
27737e7e84 configure: rename remaining HAVE_EGL_PLATFORM_* guards
Analogous to others earlier, these will be used to control the platform
for more than the EGL driver.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-05-19 19:44:00 +01:00
Emil Velikov
3208fd2e46 configure: move platform handling further up
We'll need it for the Vulkan drivers and the VL targets.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-05-19 19:43:51 +01:00
Rob Herring
de6f3cce8c Android: r600: fix build when LLVM is disabled
There's still an error after my recent clean-up if LLVM is not patched to
enable AMDGPU target:

external/mesa3d/src/amd/common/ac_llvm_util.c:38:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetInfo' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        LLVMInitializeAMDGPUTargetInfo();
        ^
external/mesa3d/src/amd/common/ac_llvm_util.c:39:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTarget' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        LLVMInitializeAMDGPUTarget();
        ^
external/mesa3d/src/amd/common/ac_llvm_util.c:40:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetMC' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        LLVMInitializeAMDGPUTargetMC();
        ^
external/mesa3d/src/amd/common/ac_llvm_util.c:41:2: error: implicit declaration of function 'LLVMInitializeAMDGPUAsmPrinter' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        LLVMInitializeAMDGPUAsmPrinter();
        ^

We need to drop libmesa_amd_common when LLVM is disabled, however there's
still a dependency on include paths for ac_binary.h. So explicitly add the
include path when LLVM is disabled.

Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-19 19:03:08 +01:00
Rob Herring
5771ecc90e virgl: fix virgl_bo_transfer_{put, get} box struct copy
Commit 3dfe61ed6e ("gallium: decrease the size of pipe_box - 24 -> 16
bytes") changed the size of pipe_box, but the virgl code was relying on
pipe_box and drm_virtgpu_3d_box structs having the same size/layout when doing
a struct copy. Copy the fields one by one instead.
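
To illustrate why the struct copy broke, here is a sketch with two hypothetical stand-in structs (narrow_box and wire_box are not the real pipe_box or drm_virtgpu_3d_box definitions). Once the field widths differ, only a field-by-field copy converts each value correctly:

```c
#include <stdint.h>

/* Stand-in for a box shrunk to 16 bytes by narrowing some fields. */
struct narrow_box {
   int32_t x;
   int16_t y, z;
   int32_t width;
   int16_t height, depth;
};

/* Stand-in for a 24-byte ioctl box with six 32-bit fields. */
struct wire_box {
   uint32_t x, y, z;
   uint32_t w, h, d;
};

/* Copy field by field so each value is widened individually. */
static void
box_to_wire(const struct narrow_box *src, struct wire_box *dst)
{
   dst->x = src->x;
   dst->y = src->y;
   dst->z = src->z;
   dst->w = src->width;
   dst->h = src->height;
   dst->d = src->depth;
}
```

A memcpy() or cast between the two types would instead reinterpret the 16-bit fields as parts of 32-bit ones.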

Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Fixes: 3dfe61ed6e ("gallium: decrease the size of pipe_box - 24 -> 16 bytes")
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-19 19:02:32 +01:00
Emil Velikov
e19ea928b9 egl: add g_egldispatchstubs.h to the release tarball
Fixes: ce562f9e3f ("EGL: Implement the libglvnd interface for EGL (v3)")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-19 19:02:12 +01:00
Tapani Pälli
f347bac30f egl/android: fix segfault within swap_buffers
Function droid_swap_buffers may get called without dri2_surf->buffer set,
in these cases we don't have a back buffer set either. Patch fixes segfault
seen with 3DMark that uses android.opengl.GLSurfaceView for rendering its UI.

backtrace:
   #00 pc 00013f88  /system/lib/egl/libGLES_mesa.so (droid_swap_buffers+104)
   #01 pc 000117b2  /system/lib/egl/libGLES_mesa.so (dri2_swap_buffers+50)
   #02 pc 000058b2  /system/lib/egl/libGLES_mesa.so (eglSwapBuffers+386)
   #03 pc 00011329  /system/lib/libEGL.so (eglSwapBuffersWithDamageKHR+553)
   #04 pc 000118e7  /system/lib/libEGL.so (eglSwapBuffers+55)
   #05 pc 000754dc  /system/lib/libandroid_runtime.so

Note, this is v1 as v2 caused dEQP regressions.

Fixes: 2acc69d ("EGL/Android: Add EGL_EXT_buffer_age extension")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-19 13:57:52 +03:00
Daniel Stone
1f2d0093bf egl/wayland: Ensure we get a back buffer
Commit 9ca6711faa changed the Wayland winsys to only block for the
frame callback inside SwapBuffers, rather than get_back_bo. get_back_bo
would perform a single non-blocking Wayland event dispatch, to try to
find any release events which we had pulled off the wire but not
actually processed. The blocking dispatch was moved to SwapBuffers.

This removed a guarantee that we would've processed all events inside
get_back_bo(), and introduced a failure whereby the server could've sent
a buffer release event, but we wouldn't have read it. In clients
unconstrained by SwapInterval (rendering ~as fast as possible), which
were being displayed directly without composition (buffer release delayed),
this could lead to get_back_bo() failing because there were no free
buffers available to it.

The drawing rightly failed, but this was papered over because of the
path in eglSwapBuffers() which attempts to guarantee a BO, in order to
support calling SwapBuffers twice in a row with no rendering actually
having been performed.

Since eglSwapBuffers will perform a blocking dispatch of Wayland
events, a buffer release would have arrived by that point, and we
could then choose a buffer to post to the server. The effect was that
frames were displayed out-of-order, since we grabbed a frame with random
past content to display to the compositor.

Ideally get_back_bo() failing should store a failure flag inside the
surface and cause the next SwapBuffers to fail, but for the meantime,
restore the correct behaviour such that get_back_bo() no longer fails.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Eero Tamminen <eero.t.tamminen@intel.com>
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98833
Fixes: 9ca6711faa ("Revert "wayland: Block for the frame callback in get_back_bo not dri2_swap_buffers"")
2017-05-19 09:36:19 +01:00
Daniel Stone
03dd9a88b0 egl/wayland: Use per-surface event queues
During display initialisation, we need a separate event queue to handle
the registry events, which is correctly handled. But we also need
separate per-surface event queues to handle swapchain-related events,
such as surface frame events and buffer release events. This prevents two
surfaces from the same EGLDisplay, both current on separate threads, from
dispatching each other's events.

Create separate per-surface event queues, create wl_surface and wl_drm
proxy wrapper objects per surface, so we eliminate the race around
sending events to the wrong queue. swrast buffers do not need a
dedicated proxy wrapper, as the wl_shm_pool used to create the
wl_buffers, being transient, can itself be assigned to a queue.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 36b9976e1f ("egl/wayland: Avoid race conditions when on non-main thread")
Cc: mesa-stable@lists.freedesktop.org
2017-05-19 09:36:15 +01:00
Daniel Stone
8118bc269f egl/wayland: Don't open-code roundtrip
wl_display_roundtrip_queue() exists and can replace roundtrip(). The
API was introduced with wayland 1.6, while we currently require 1.11.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-05-19 09:36:11 +01:00
Daniel Stone
5034c61558 vulkan/wsi/wayland: Use proxy wrappers for swapchain
Though most swapchain operations used a queue, they were racy in that
the object was created with the queue only set later, meaning that its
event could potentially be dispatched from the default queue in between
these two steps.

Use proxy wrappers to avoid this race, also assigning wl_buffers created
for the swapchain to the event queue.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-05-19 09:36:06 +01:00
Daniel Stone
c902a1957d vulkan/wsi/wayland: Use per-display event queue
Calling random callbacks on the display's event queue is hostile, as
we may call into client code when it least expects it. Create our own
event queue, one per wsi_wl_display, and use that for the registry.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-05-19 09:36:03 +01:00
Daniel Stone
afe8c8a299 vulkan/wsi/wayland: Remove roundtrip when creating image
There's no need to call wl_display_roundtrip() after trying to create a
buffer through wl_drm; if it succeeds then everything is fine, and if it
fails, then we get a fatal protocol error so can't recover anyway.

Additionally, doing a roundtrip on the default / main application queue
is destructive anyway, so would need to be its own queue.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-05-19 09:36:01 +01:00
Daniel Stone
d9a8bba7f4 vulkan: Fix Wayland uninitialised registry
Untangle the exit cleanup paths so we don't try to use the registry
variable before it's been initialised.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-05-19 09:34:52 +01:00
Nanley Chery
688ddb85c8 i965/formats: Update the three-channel DXT1 mappings
The procedure for decompressing an opaque DXT1 OpenGL format is
dependent on the comparison of two colors stored in the first 32 bits of
the compressed block. Here's the specified OpenGL behavior for
reference:

   The RGB color for a texel at location (x,y) in the block is given by:

      RGB0,              if color0 > color1 and code(x,y) == 0
      RGB1,              if color0 > color1 and code(x,y) == 1
      (2*RGB0+RGB1)/3,   if color0 > color1 and code(x,y) == 2
      (RGB0+2*RGB1)/3,   if color0 > color1 and code(x,y) == 3

      RGB0,              if color0 <= color1 and code(x,y) == 0
      RGB1,              if color0 <= color1 and code(x,y) == 1
      (RGB0+RGB1)/2,     if color0 <= color1 and code(x,y) == 2
      BLACK,             if color0 <= color1 and code(x,y) == 3
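
The table above can be sketched as a small C function. This is an illustration only: the names are hypothetical, color0/color1 are assumed to be the packed RGB565 endpoints, and expanding them to 8 bits by bit replication is one common convention rather than something the table mandates:

```c
#include <stdint.h>

struct rgb8 {
   uint8_t r, g, b;
};

/* Expand packed RGB565 to 8 bits per channel (bit replication, assumed). */
static struct rgb8
expand_rgb565(uint16_t c)
{
   unsigned r5 = (c >> 11) & 0x1f;
   unsigned g6 = (c >> 5) & 0x3f;
   unsigned b5 = c & 0x1f;
   struct rgb8 out = {
      (uint8_t) ((r5 << 3) | (r5 >> 2)),
      (uint8_t) ((g6 << 2) | (g6 >> 4)),
      (uint8_t) ((b5 << 3) | (b5 >> 2)),
   };
   return out;
}

/* Weighted average of two channel values: (wa*a + wb*b) / (wa + wb). */
static uint8_t
mix(unsigned wa, uint8_t a, unsigned wb, uint8_t b)
{
   return (uint8_t) ((wa * a + wb * b) / (wa + wb));
}

/* Decode one texel of an opaque DXT1 block from its 2-bit code. */
static struct rgb8
dxt1_rgb_texel(uint16_t color0, uint16_t color1, unsigned code)
{
   const struct rgb8 c0 = expand_rgb565(color0);
   const struct rgb8 c1 = expand_rgb565(color1);
   const struct rgb8 black = { 0, 0, 0 };
   unsigned w0, w1;

   switch (code & 3) {
   case 0:
      return c0;
   case 1:
      return c1;
   case 2:
      /* (2*RGB0+RGB1)/3 or (RGB0+RGB1)/2, depending on the comparison. */
      if (color0 > color1) { w0 = 2; w1 = 1; } else { w0 = 1; w1 = 1; }
      break;
   default:
      /* (RGB0+2*RGB1)/3, or BLACK when color0 <= color1. */
      if (color0 <= color1)
         return black;
      w0 = 1; w1 = 2;
      break;
   }

   struct rgb8 out = {
      mix(w0, c0.r, w1, c1.r),
      mix(w0, c0.g, w1, c1.g),
      mix(w0, c0.b, w1, c1.b),
   };
   return out;
}
```

A decoder that always takes the color0 > color1 branch gives different results for codes 2 and 3 whenever color0 <= color1.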

The sampling operation performed on an opaque DXT1 Intel format essentially
hard-codes the comparison result of the two colors as color0 > color1.
This means that the behavior is incompatible with OpenGL. This is stated
in the SKL PRM, Vol 5: Memory Views:

   Opaque Textures (DXT1_RGB)
      Texture format DXT1_RGB is identical to DXT1, with the exception that the
      One-bit Alpha encoding is removed. Color 0 and Color 1 are not compared, and
      the resulting texel color is derived strictly from the Opaque Color Encoding.
      The alpha channel defaults to 1.0.

      Programming Note
      Context: Opaque Textures (DXT1_RGB)
      The behavior of this format is not compliant with the OGL spec.

The opaque and non-opaque DXT1 OpenGL formats are specified to be
decoded in exactly the same way except the BLACK value must have a
transparent alpha channel in the latter. Use the four-channel BC1 Intel
formats with the alpha set to 1 to provide the behavior required by the
spec. Note that the alpha is already set to 1 for RGB formats in
brw_get_texture_swizzle().
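
The decode table above can be sketched in Python as follows. This is only an illustration and not the i965 code: `decode_dxt1_rgb` and `decode_dxt1_rgb_hw` are hypothetical names, the endpoints are assumed to be already expanded to RGB tuples, and integer division stands in for whatever rounding the spec or hardware actually applies.

```python
def decode_dxt1_rgb(color0, color1, rgb0, rgb1, code):
    """Return the RGB texel for one 2-bit code, following the table above.

    color0/color1 are the raw 16-bit endpoint values used for the comparison;
    rgb0/rgb1 are the same endpoints expanded to (r, g, b) tuples.
    Hypothetical helper for illustration only.
    """
    def mix(a, b, wa, wb, div):
        # Weighted blend of two RGB tuples, channel by channel.
        return tuple((wa * x + wb * y) // div for x, y in zip(a, b))

    if color0 > color1:
        table = [rgb0, rgb1, mix(rgb0, rgb1, 2, 1, 3), mix(rgb0, rgb1, 1, 2, 3)]
    else:
        table = [rgb0, rgb1, mix(rgb0, rgb1, 1, 1, 2), (0, 0, 0)]
    return table[code]


def decode_dxt1_rgb_hw(color0, color1, rgb0, rgb1, code):
    """Model of the opaque Intel format: the comparison is ignored and the
    color0 > color1 branch is always taken."""
    return decode_dxt1_rgb(1, 0, rgb0, rgb1, code)
```

With color0 <= color1 and code 3, the first function returns BLACK while the second returns a blend of the two endpoints, which is the mismatch this commit works around.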

v2: Provide a more detailed commit message (Kenneth Graunke).
v3: Ensure the alpha channel is set to 1 for DXT1 formats.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100925
Cc: <mesa-stable@lists.freedesktop.org>
Acked-by: Tapani Pälli <tapani.palli@intel.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2017-05-18 16:46:15 -07:00
Nanley Chery
56458cb168 anv/formats: Update the three-channel BC1 mappings
The procedure for decompressing an opaque BC1 Vulkan format is dependent on the
comparison of two colors stored in the first 32 bits of the compressed block.
Here's the specified OpenGL (and Vulkan) behavior for reference:

   The RGB color for a texel at location (x,y) in the block is given by:

      RGB0,              if color0 > color1 and code(x,y) == 0
      RGB1,              if color0 > color1 and code(x,y) == 1
      (2*RGB0+RGB1)/3,   if color0 > color1 and code(x,y) == 2
      (RGB0+2*RGB1)/3,   if color0 > color1 and code(x,y) == 3

      RGB0,              if color0 <= color1 and code(x,y) == 0
      RGB1,              if color0 <= color1 and code(x,y) == 1
      (RGB0+RGB1)/2,     if color0 <= color1 and code(x,y) == 2
      BLACK,             if color0 <= color1 and code(x,y) == 3

The sampling operation performed on an opaque DXT1 Intel format essentially
hard-codes the comparison result of the two colors as color0 > color1. This
means that the behavior is incompatible with OpenGL and Vulkan. This is stated
in the SKL PRM, Vol 5: Memory Views:

   Opaque Textures (DXT1_RGB)
      Texture format DXT1_RGB is identical to DXT1, with the exception that the
      One-bit Alpha encoding is removed. Color 0 and Color 1 are not compared, and
      the resulting texel color is derived strictly from the Opaque Color Encoding.
      The alpha channel defaults to 1.0.

      Programming Note
      Context: Opaque Textures (DXT1_RGB)
      The behavior of this format is not compliant with the OGL spec.

The opaque and non-opaque BC1 Vulkan formats are specified to be decoded in
exactly the same way except the BLACK value must have a transparent alpha
channel in the latter. Use the four-channel BC1 Intel formats with the alpha
set to 1 to provide the behavior required by the spec.

v2 (Kenneth Graunke):
- Provide a more detailed commit message.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100925
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2017-05-18 16:46:15 -07:00
Jason Ekstrand
c499faebd7 anv: Add an option to abort on device loss
This is mostly for running in our CI system to prevent dEQP from
continuing on to the next test if we get a GPU hang.  As it currently
stands, dEQP uses the same VkDevice for almost all tests and if one of
the tests hangs, we set the anv_device::device_lost flag and report
VK_ERROR_DEVICE_LOST for all queue operations from that point forward
without sending anything to the GPU.  dEQP will happily continue trying
to run tests and reporting failures until it eventually gets a crash that
forces the test runner to start over.  This circumvents the problem by
just aborting the process if we ever get a GPU hang.  Since this is not
the recommended behavior most of the time, we hide it behind an
environment variable.
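
A rough Python model of the behavior described above, under stated assumptions: the `Device` class and the variable name `EXAMPLE_ABORT_ON_DEVICE_LOSS` are made up for this sketch (the commit message does not name the variable), and `SystemExit` stands in for a real process abort.

```python
import os

VK_SUCCESS = 0
VK_ERROR_DEVICE_LOST = -4  # value defined by the Vulkan headers


class Device:
    """Toy stand-in for a driver device with a device_lost flag."""

    def __init__(self, environ=os.environ):
        self.device_lost = False
        # Hypothetical variable name; opt-in because aborting is rarely wanted.
        self.abort_on_loss = environ.get("EXAMPLE_ABORT_ON_DEVICE_LOSS") == "1"

    def mark_lost(self):
        """Called when a GPU hang is detected."""
        self.device_lost = True
        if self.abort_on_loss:
            raise SystemExit("GPU hang detected, aborting")

    def queue_submit(self, work):
        """Once lost, report the error without sending anything to the GPU."""
        if self.device_lost:
            return VK_ERROR_DEVICE_LOST
        work()
        return VK_SUCCESS
```

Without the variable set, every later submit simply reports `VK_ERROR_DEVICE_LOST`; with it set, the process stops at the first hang so a test runner can restart cleanly.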

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-05-18 16:32:11 -07:00
Jason Ekstrand
53f997de77 anv: Wrap the device lost error in vk_error in QueueSubmit
We weren't wrapping this before because anv_cmd_buffer_execbuf may throw
a more meaningful error message.  However, we do change the error code
into VK_ERROR_DEVICE_LOST, so we should print a new message.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-05-18 16:32:11 -07:00
Marek Olšák
807e1d2577 radeonsi/gfx9: use CE RAM optimally
On GFX9 with only 4K CE RAM, define the range of slots that will be
allocated in CE RAM. All other slots will be uploaded directly. This will
switch dynamically according to which slots are used by current shaders.

GFX9 CE usage should now be similar to VI instead of being often disabled.

Tested on VI by taking the GFX9 CE allocation codepath and setting
num_ce_slots = 2 everywhere to get frequent switches between both modes.
CE is still disabled on GFX9.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
1cde473ec0 radeonsi: remove CE offset alignment restriction
This was only needed by LOAD_CONST_RAM, which is now only used to load
the whole CE RAM.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
a7f098fb76 radeonsi: only upload (dump to L2) those descriptors that are used by shaders
This decreases the size of CE RAM dumps to L2, or the size of descriptor
uploads without CE.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
53c2ef36da radeonsi: record which descriptor slots are used by shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
38828094e9 radeonsi: update si_ce_needed_cs_space
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
edb59ef2dc radeonsi: do only 1 big CE dump at end of IBs and one reload in the preamble
A later commit will only upload descriptors used by shaders, so we won't do
full dumps anymore, so the only way to have a complete mirror of CE RAM
in memory is to do a separate dump after the last draw call.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
06690e63f7 radeonsi: remove early return in si_upload_descriptors
All updates of descriptors_dirty also set dirty_mask, so the return is
unnecessary. The next commit will want this function to be executed
even if dirty_mask == 0.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
b8f8d9e46c radeonsi: clamp indirect index to the number of declared shader resources
We'll do partial uploads of descriptor arrays, so we need to clamp
against what shaders declare.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
f07c15ef80 radeonsi: merge sampler and image descriptor lists into one
Sampler slots: slot[8], .. slot[39] (ascending)
Image slots: slot[7], .. slot[0] (descending)

Each image occupies 1/2 of each slot, so there are 16 images in total,
therefore the layout is: slot[15], .. slot[0]. (in 1/2 slot increments)

Updating image slot 2n+i (i <= 1) also dirties and re-uploads slot 2n+!i.
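
One reading of this layout, sketched in Python. The helper names are hypothetical and the mapping of image 0 to half slot 15 is an assumption drawn from the "descending" wording above, not taken from the radeonsi source.

```python
NUM_IMAGES = 16          # images occupy half slots 15..0
FIRST_SAMPLER_SLOT = 8   # samplers occupy full slots 8..39


def sampler_slot(index):
    """Full slot used by sampler `index` (ascending from slot 8)."""
    return FIRST_SAMPLER_SLOT + index


def image_half_slot(index):
    """Half slot used by image `index` (descending from half slot 15)."""
    return NUM_IMAGES - 1 - index


def dirty_half_slots(index):
    """Updating one half slot also dirties its partner in the same full slot."""
    h = image_half_slot(index)
    return {h, h ^ 1}
```

Here `h ^ 1` flips the low bit, which is the `2n+i` to `2n+!i` pairing described above, and `image_half_slot(i) // 2` gives the full slot that gets re-uploaded.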

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
5df24c3fa6 radeonsi: merge constant and shader buffers descriptor lists into one
Constant buffers: slot[16], .. slot[31] (ascending)
Shader buffers: slot[15], .. slot[0] (descending)

The idea is that if we have 4 constant buffers and 2 shader buffers, we only
have to upload 6 slots. That optimization is left for a later commit.
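
The arithmetic behind that example can be sketched as follows. The function names are hypothetical; the sketch assumes buffers are bound densely starting at index 0, so the used slots form one contiguous range around the slot 15/16 boundary.

```python
FIRST_CONST_SLOT = 16  # constant buffers: slots 16..31, ascending


def const_buffer_slot(index):
    return FIRST_CONST_SLOT + index


def shader_buffer_slot(index):
    # Shader buffers: slots 15..0, descending.
    return FIRST_CONST_SLOT - 1 - index


def upload_range(num_const, num_shader):
    """Return (first_slot, slot_count) covering every used slot."""
    first = FIRST_CONST_SLOT - num_shader
    return first, num_const + num_shader
```

For 4 constant buffers and 2 shader buffers this gives slots 14 through 19, six slots in total, because the two lists grow outward from a shared boundary.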

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
d88ca12350 gallium/u_threaded: add a fast path for unbinding shader buffers
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
d4c8f429d1 gallium/u_threaded: add a fast path for unbinding shader images
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
ae7f7e8162 st/mesa: silence a valgrind warning in u_threaded_context due to st_draw_vbo
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Marek Olšák
767868ff6d glsl_to_tgsi: declare all SSBOs and atomics when indirect indexing is used
Only the first array element was declared, so tgsi_shader_info::
shader_buffers_declared didn't match what the shader was using.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 22:15:02 +02:00
Samuel Pitoiset
1468e29e02 radeonsi: get the sampler view type from inst->Texture for TG4
This will also magically fix this special lowering for
bindless samplers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 21:48:16 +02:00
Samuel Pitoiset
5cb2eee557 tgsi: store the sampler view type directly in the instruction
RadeonSI needs to do a special lowering for Gather4 with integer
formats, but with bindless samplers we just can't access the index.

Instead, store the return type in the instruction like the target.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 21:48:16 +02:00
Samuel Pitoiset
ac3f6bf608 tgsi: remove some unused OPCODE macros
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-18 21:48:16 +02:00
Tom Stellard
14e525a4d7 gallivm: Make sure module has the correct data layout when pass manager runs
The datalayout for modules was purposely not being set in order to work around
the fact that the ExecutionEngine requires that the module's datalayout
matches the datalayout of the TargetMachine that the ExecutionEngine is
using.

When the pass manager runs on a module with no datalayout, it uses
the default datalayout which is little-endian.  This causes problems
on big-endian targets, because some optimizations that are legal on
little-endian are illegal on big-endian.

To resolve this, we set the datalayout prior to running the pass
manager, and then clear it before creating the ExecutionEngine.
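
As a generic illustration of why byte order matters to an optimizer (not necessarily the specific LLVM transform that misbehaved here), consider narrowing a 32-bit load followed by a truncation into a single byte load at the same address. The helper name below is hypothetical.

```python
import struct


def low_byte_via_narrow_load(value, byte_order):
    """Store a 32-bit value, then read one byte at offset 0.

    On a little-endian layout this yields the low 8 bits, so replacing
    'load 32 bits, truncate to 8' with 'load 8 bits' at the same address
    is valid. On big-endian the byte at offset 0 is the high byte instead.
    """
    fmt = "<I" if byte_order == "little" else ">I"
    memory = struct.pack(fmt, value)
    return memory[0]
```

A pass that assumes the little-endian default would pick offset 0 and silently read the wrong byte on a big-endian target.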

This patch fixes a lot of piglit tests on big-endian ppc64.

Cc: mesa-stable@lists.freedesktop.org
2017-05-18 17:52:47 +00:00
Chad Versace
8f62d21bd7 egl: Partially revert 23c86c74, fix eglMakeCurrent
Fixes regressions in Android CtsVerifier.apk on Intel Chrome OS devices
due to incorrect error handling in eglMakeCurrent. See below on how to
confirm the regression is fixed.

This partially reverts

    commit 23c86c74cc
    Author:  Chad Versace <chadversary@chromium.org>
    Subject: egl: Emit error when EGLSurface is lost

The problem with commit 23c86c74 is that, once an EGLSurface became
lost, the app could never unbind the bad surface. Each attempt to unbind
the bad surface with eglMakeCurrent failed with EGL_BAD_CURRENT_SURFACE.

Specifically, the bad commit added the error handling below. #2 and #3
were right, but #1 was wrong.

    1. eglMakeCurrent emits EGL_BAD_CURRENT_SURFACE if the calling
       thread has unflushed commands and either previous surface is no
       longer valid.

    2. eglMakeCurrent emits EGL_BAD_NATIVE_WINDOW if either new surface
       is no longer valid.

    3. eglSwapBuffers emits EGL_BAD_NATIVE_WINDOW if the swapped surface
       is no longer valid.

When I wrote the bad commit, I misunderstood the EGL spec language
for #1. The correct behavior, if I understand correctly now, is
below. This patch doesn't implement the correct behavior, though, it
just reverts the broken behavior.

    - Assume a bound EGLSurface is no longer valid.
    - Assume the bound EGLContext has unflushed commands.
    - The app calls eglMakeCurrent. The spec requires eglMakeCurrent to
      implicitly flush. After flushing, eglMakeCurrent emits
      EGL_BAD_CURRENT_SURFACE and does *not* alter the thread's
      current bindings.
    - If the app calls eglMakeCurrent again, and the app inserts no
      commands into the GL command stream between the two eglMakeCurrent
      calls, then this second eglMakeCurrent succeeds without emitting an
      error.
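
The sequence above can be modeled with a small Python state machine. This is a toy model of the behavior as described in this message, not Mesa's EGL code and not what this patch implements; the `Thread` class and its fields are hypothetical, and the newly bound surface is assumed to be valid.

```python
class Thread:
    """Toy model of one thread's EGL binding state."""

    def __init__(self, surface, surface_valid=True):
        self.current_surface = surface
        self.surface_valid = surface_valid
        self.unflushed = False

    def draw(self):
        """Insert a command into the GL command stream."""
        self.unflushed = True

    def make_current(self, new_surface):
        had_unflushed = self.unflushed
        self.unflushed = False  # eglMakeCurrent implicitly flushes
        if had_unflushed and not self.surface_valid:
            # Error is emitted and the current bindings are left untouched.
            return "EGL_BAD_CURRENT_SURFACE"
        self.current_surface = new_surface
        self.surface_valid = True  # assumption: the new surface is valid
        return "EGL_SUCCESS"
```

Calling `make_current` twice in a row after a draw on a lost surface fails once and then succeeds, which is what lets the app eventually unbind the bad surface.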

How to confirm this fixes the regression:

    Download android-cts-verifier-7.1_r5-linux_x86-x86.zip from
    source.android.com, unpack, and `adb install CtsVerifier.apk`.
    Run test "Projection Cube". Click the Pass button (a
    green checkmark). Then run test "Projection Widget". Confirm that
    widgets are visible and that logcat does not complain about
    eglMakeCurrent failure.

    Then confirm there are no regressions in the cts-traded module that
    commit 263243b1 fixed:

        cts-tf > run cts --skip-preconditions --skip-device-info \
                 -m CtsCameraTestCases \
                 -t android.hardware.camera2.cts.RobustnessTest

    Tested with Chrome OS board "reef".

Fixes: 23c86c74 (egl: Emit error when EGLSurface is lost)
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Nicolas Boichat <drinkcat@chromium.org>
Cc: Emil Velikov <emil.velikov@collabora.com>
2017-05-18 10:25:52 -07:00
Iago Toral Quiroga
2322ddf548 anv: fix multiview for clear commands
According to the VK_KHX_multiview spec:

"Multiview causes all drawing and clear commands in the subpass to
behave as if they were broadcast to each view, where each view is
represented by one layer of the framebuffer attachments."

This adds support for multiview clears, which were missing in the
initial implementation.

v2 (Jason):
  - split multiview from regular case
  - Use for_each_bit() macro
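
A minimal Python sketch of broadcasting a clear to each view. The generator below only mimics the idea of a for_each_bit() loop; `clear_attachment`, the list-of-layers model, and clearing every layer in the non-multiview case are simplifying assumptions.

```python
def for_each_bit(mask):
    """Yield the index of every set bit in mask, lowest first."""
    bit = 0
    while mask:
        if mask & 1:
            yield bit
        mask >>= 1
        bit += 1


def clear_attachment(layers, view_mask, value):
    """Clear one framebuffer layer per active view.

    With a zero view mask (no multiview), fall back to the regular path,
    modeled here as clearing every layer.
    """
    if view_mask:
        for view in for_each_bit(view_mask):
            layers[view] = value
    else:
        for i in range(len(layers)):
            layers[i] = value
```

Keeping the multiview branch separate from the regular one mirrors the split requested in v2.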

Fixes new CTS multiview tests:
dEQP-VK.multiview.clear_attachments.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-18 11:53:25 +02:00
Nicolai Hähnle
70215a23c6 ac: add missing extern "C" guards
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:53 +02:00
Nicolai Hähnle
6c01c4b907 ac: add radeon_info::num_{sdma,compute}_rings
Vulkan needs them.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:53 +02:00
Nicolai Hähnle
c488bf24ed ac: add radeon_surf::htile_slice_size
Vulkan needs it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:52 +02:00
Nicolai Hähnle
98a2492290 ac_surface: use radeon_info from ac_gpu_info
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:52 +02:00
Nicolai Hähnle
988c866212 ac/radeonsi: move radeon_info initialization to amd/common
v2: update Android.common.mk (Emil)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:52 +02:00
Nicolai Hähnle
de9dd4f9f1 ac/radeonsi: move struct radeon_info to ac_gpu_info.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:52 +02:00
Nicolai Hähnle
4d6e75776d ac/radeonsi: move some aspects of sanity checking to ac_surface
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:52 +02:00
Nicolai Hähnle
00f466bad9 ac/radeonsi: add ac_compute_surface to automatically switch gfx6 vs. gfx9
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:52 +02:00
Nicolai Hähnle
8aabed64c3 ac/radeonsi: move the bulk of gfx9_surface_init to ac_surface
We can now merge the two *_surface_init functions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:51 +02:00
Nicolai Hähnle
db77cd879b ac/radeonsi: move the bulk of gfx6_surface_init to ac_surface
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:51 +02:00
Nicolai Hähnle
f187a49322 ac/radeonsi: move amdgpu_addr_create to ac_surface
v2:
- update Android.common.mk (Emil)
- rebase on top of Raven support

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2017-05-18 11:48:51 +02:00
Nicolai Hähnle
15a844986a ac/radeonsi: move surface definitions to new header ac_surface.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:48:51 +02:00
Nicolai Hähnle
377877ff5f st/mesa: remove an incorrect assertion
There is really no reason why the current DrawBuffer needs to be complete
at this point. In particular, the assertion gets hit on the X server side
in libglx when running .../piglit/bin/glx-get-current-display-ext -auto
(which uses indirect GLX rendering).

Fixes: 19b61799e3 ("st/mesa: don't cast the incomplete framebufer to st_framebuffer")
Reported-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-18 11:47:27 +02:00
Samuel Iglesias Gonsálvez
e69e5c7006 i965/vec4: load dvec3/4 uniforms first in the push constant buffer
Reorder the uniforms to load first the dvec4-aligned variables in the
push constant buffer and then push the vec4-aligned ones. It takes
into account that the relocated uniforms should be aligned to their
channel size.

This fixes a bug where part of a dvec3/4 might be loaded in one GRF and
the rest in the next GRF, so the region parameters needed to read it could
break the HW rules.

v2:
- Fix broken logic.
- Add a comment to explain what should be needed to optimise the usage
  of the push constant buffer slots, as this patch does not pack the
  uniforms.

v3:
- Implemented the push constant buffer usage optimization.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Acked-by: Francisco Jerez <currojerez@riseup.net>
2017-05-18 06:49:54 +02:00
Samuel Iglesias Gonsálvez
8aa6ada838 i965/vec4: fix swizzle and writemask when loading an uniform with constant offset
It was setting XYZW swizzle and writemask for all uniforms, no matter if they
were a vector or scalar, so this can lead to problems when loading them
to the push constant buffer.

Moreover, 'shift' calculation was designed to calculate the offset in
DWORDS, but it doesn't take into account DFs, so the calculated swizzle
for the latter ones was wrong.

The indirect case is not changed because MOV INDIRECT will write
to all components. Added an assert to verify that these uniforms
are aligned.

v2:
- Fix 'shift' calculation (Curro)
- Set both swizzle and writemask.
- Add assert(shift == 0) for the indirect case.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-05-18 06:49:54 +02:00
Samuel Iglesias Gonsálvez
354f7f2cb9 i965/vec4/gs: restore the uniform values which was overwritten by failed vec4_gs_visitor execution
We are going to add a packing feature to reduce the usage of the push
constant buffer. One of the consequences is that 'nr_params' would be
modified by vec4_visitor's run call, so we need to restore it if one of
them failed before executing the fallback ones. Same thing happens to the
uniform values that would be reordered afterwards.

Fixes GL45-CTS.arrays_of_arrays_gl.InteractionFunctionCalls2 when
the dvec4 alignment and packing patch is applied.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Acked-by: Francisco Jerez <currojerez@riseup.net>
2017-05-18 06:49:28 +02:00
Eric Anholt
e8ea42d245 vc4: Don't allocate new BOs to avoid synchronization when they're shared.
If X11 did a software fallback to the entire screen, we would throw out
the BO the screen is scanning out from and allocate a new one.

Cc: mesa-stable@lists.freedesktop.org
2017-05-17 14:18:29 -07:00
Eric Anholt
50e78cd04f vc4: Drop pointless indirections around BO import/export.
I've since found them to be more confusing by adding indirections than
clarifying by screening off resources from the handle/fd import/export
process.
2017-05-17 14:18:26 -07:00
Eric Anholt
76e4ab5715 vc4: Drop the u_resource_vtbl no-op layer.
We only ever attached one vtbl, so it was a waste of space and
indirections.
2017-05-17 14:18:26 -07:00
Marek Olšák
bd4b224fa6 gallium/radeon: use a top-of-pipe timestamp for the start of TIME_ELAPSED
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 20:28:44 +02:00
Marek Olšák
4f50c91c32 mesa: don't check mapped buffers in every draw call if drivers allow it
Before: DrawElements (16 VBOs) w/ no state change: 4.34 million/s
After:  DrawElements (16 VBOs) w/ no state change: 8.80 million/s

This inefficiency was uncovered by Timothy Arceri's no_error work.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 20:28:44 +02:00
Marek Olšák
d02d8ea8b6 mesa: add gl_constants::AllowMappedBuffersDuringExecution
for skipping mapped-buffer checking in every GL draw call

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 20:28:44 +02:00
Marek Olšák
50189379fa gallium: add PIPE_CAP_ALLOW_MAPPED_BUFFERS_DURING_EXECUTION
for skipping mapped-buffer checking in every GL draw call

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 20:28:44 +02:00
Hans de Goede
84f764a759 glxglvnddispatch: Add missing dispatch for GetDriverConfig
Together with some fixes to xdriinfo this fixes xdriinfo not working
with glvnd.

Since apps (xdriinfo) expect GetDriverConfig to work without needing to
go through the dance to set up a glxcontext (which is a reasonable
expectation IMHO), the dispatch for this ends up significantly different
than any other dispatch function.

This patch gets the job done, but I'm not really happy with how this
patch turned out, suggestions for a better fix are welcome.

Cc: Kyle Brenneman <kbrenneman@nvidia.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
2017-05-17 20:02:18 +02:00
Tim Rowley
dce41f7728 swr: don't use AttributeSet with llvm >= 5
This change fixes the build break with llvm-svn.

r301981 of llvm-svn made add/remove of function attributes
use AttrBuilder instead of AttributeList.

Tested with llvm-3.9, llvm-4.0, llvm-svn.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-17 11:24:46 -05:00
Chih-Wei Huang
bfc0c23843 Android: correct libz dependency
Commit 6facb0c0 ("android: fix libz dynamic library dependencies")
unconditionally adds libz as a dependency to all shared libraries.
That is unnecessary.

Commit 85a9b1b5 introduced libz as a dependency to libmesa_util.
So only the shared libraries that use libmesa_util need libz.

Fix Android Lollipop build by adding the include path of zlib to
libmesa_util explicitly instead of getting the path implicitly
from zlib since it doesn't export the include path in Lollipop.

Fixes: 6facb0c0 "android: fix libz dynamic library dependencies"

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rob Herring <robh@kernel.org>
2017-05-17 14:04:18 +01:00
Timothy Arceri
f96edf72b4 mesa: add KHR_no_error support for glDispatchCompute*()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:04 +10:00
Timothy Arceri
d1894c42ef mesa: add DispatchCompute* helpers
These will be used to add KHR_no_error support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:04 +10:00
Timothy Arceri
07e14d561c mesa: move FLUSH_CURRENT() calls out of DispatchCompute*() validation
This is required to add KHR_no_error support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:04 +10:00
Timothy Arceri
f98411eaad mesa: compute.c C99 tidy up
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:04 +10:00
Timothy Arceri
64757e73de mesa: move DispatchCompute() validation to compute.c
This is the only place it is used so there is no reason for it to be
in api_validate.c

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:04 +10:00
Timothy Arceri
25591adc28 mesa: add KHR_no_error support for glBlendEquationSeparateiARB()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:04 +10:00
Timothy Arceri
9aff3c605b mesa: add blend_equation_separatei() helper
Will be used to add KHR_no_error support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:04 +10:00
Timothy Arceri
5c8252ba6f mesa: add KHR_no_error support for glBlendFunc*iARB()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:04 +10:00
Timothy Arceri
b5c67f469a mesa: add blend_func_separatei() helper
This will be used to add KHR_no_error support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
b3888b7a68 mesa: add KHR_no_error support for glBufferSubData()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
7ec12293be mesa: add KHR_no_error support for glNamedBufferSubData()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
cab148c282 mesa: add buffer_sub_data() helper
This will allow us to share code between the dsa, non-dsa and
no_error variants.
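
The pattern used throughout this series can be sketched in Python as one shared helper with a `no_error` flag and thin entry points on top. The names and the bounds check are illustrative stand-ins, not Mesa's actual functions or its full validation.

```python
def buffer_sub_data(buffer, offset, data, no_error):
    """Shared helper: validate only when no_error is False, then copy."""
    if not no_error:
        if offset < 0 or offset + len(data) > len(buffer):
            return "GL_INVALID_VALUE"
    buffer[offset:offset + len(data)] = data
    return "GL_NO_ERROR"


def BufferSubData(buffer, offset, data):
    """Regular entry point: arguments are checked."""
    return buffer_sub_data(buffer, offset, data, no_error=False)


def BufferSubData_no_error(buffer, offset, data):
    """KHR_no_error entry point: the caller promises valid arguments."""
    return buffer_sub_data(buffer, offset, data, no_error=True)
```

Because the flag is a constant at each call site, an inlining compiler can drop the validation branch entirely from the no_error variant.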

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
5d29e6aab8 mesa: create validate_buffer_sub_data() helper
This change assumes meta will always pass valid arguments to
_mesa_buffer_sub_data().

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
624dc2833e mesa: add KHR_no_error support for glBufferStorage()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
cdbfb19420 mesa: add KHR_no_error support for glNamedBufferStorage()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
09687c2282 mesa: add inlined_buffer_storage() helper
This will allow us to share code between the dsa, non-dsa and
no_error variants.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
70d4d1164e mesa: add validate_buffer_storage() helper
This will allow us to add KHR_no_error support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
6c8964bf63 mesa: add KHR_no_error support for glCompressedTex*SubImage3D()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
d1033cd1eb mesa: add 3D support to compressed_tex_sub_image() helper
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
7cc190aae8 mesa: add KHR_no_error support for glCompressedTex*SubImage2D()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
1b36aa02b0 mesa: add 2D support to compressed_tex_sub_image() helper
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
d5e382e316 mesa: add KHR_no_error support for CompressedTex*SubImage1D()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
cb5627cbac mesa: add compressed_tex_sub_image() helper
This reduces duplication between the dsa and non-dsa function
and will also be used in the following commit to add
KHR_no_error support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
f1e692b452 mesa: make _mesa_compressed_texture_sub_image() static
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
3336d248e8 mesa: add KHR_no_error support for NamedFramebufferTexture
V3: use frame_buffer_texture() helper

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
4e8aa4b9a2 mesa: add KHR_no_error support for FramebufferTexture
V3: use the frame_buffer_texture() helper

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
f6229284e2 mesa: add *FramebufferTexture() support to frame_buffer_texture helper
V2: call check_layered_texture_target() even for no_error

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
4e125c4da6 mesa: add KHR_no_error support for NamedFramebufferTextureLayer
v3: use frame_buffer_texture_layer() helper

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
e75e8d6c94 mesa: add KHR_no_error support for FramebufferTextureLayer
V3: use frame_buffer_texture_layer() helper

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
f6198e9146 mesa: add no error support to frame_buffer_texture_layer() helper
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
01081c6ee4 mesa: add frame_buffer_texture_layer() helper
To be used to add KHR_no_error support while sharing code between
the DSA and non-DSA OpenGL function.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
70aa66f181 mesa: add KHR_no_error support for glUseProgram
V3: use always_inline attribute (Suggested by Nicolai)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Timothy Arceri
35a9b9a70c mesa: move use_program() inside _mesa_use_program()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-17 10:12:03 +10:00
Jason Ekstrand
e0d6f9afba intel/isl/gen6: Fix combined depth stencil alignment
All combined depth stencil buffers (even those with just stencil)
require a 4x4 alignment on Sandy Bridge.  The only depth/stencil buffer
type that requires 4x2 is separate stencil.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-16 17:04:26 -07:00
Jason Ekstrand
74d626f383 intel/isl: Refactor gen8_choose_image_alignment_el
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-16 17:04:26 -07:00
Jason Ekstrand
2486c7dd54 intel/isl: Refactor gen6_choose_image_alignment_el
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-16 17:04:26 -07:00
Jason Ekstrand
715f47cb34 intel/isl: Refactor gen7_choose_image_alignment_el
The Ivy Bridge PRM provides a nice table that handles most of the
alignment cases in one place.  For standard color buffers we have a
little freedom of choice but for most depth, stencil and compressed it's
hard-coded.  Chad's original functions split halign and valign apart and
implemented them almost entirely based on restrictions and not the
table.  This makes things way more confusing than they need to be.  This
commit gets rid of the split and makes us implement the exact table
up-front.  If our surface isn't one of the ones in the table then we
have to make real choices.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-16 17:04:26 -07:00
Pohjolainen, Topi
236f17a9f7 intel/isl/gen7: Use stencil vertical alignment of 8 instead of 4
The reasoning Chad gave in the comment for choosing a valign of 4 is
entirely bunk.  The fact that you have to multiply pitch by 2 is
completely unrelated to the halign/valign parameters used for texture
layout.  (Not completely unrelated.  W-tiling is just Y-tiling with a
bit of extra swizzling which turns 8x8 W-tiled chunks into 16x4 y-tiled
chunks so it makes everything easier if miplevels are always aligned to
8x8.)  The fact that RENDER_SURFACE_STATE::SurfaceVerticalAlignment
doesn't have a VALIGN_8 option doesn't matter since this is gen7 and you
can't do stencil texturing anyway.

v2 (Jason Ekstrand):
 - Delete most of Chad's comment and add a more descriptive commit
   message.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "17.0 17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-16 17:04:26 -07:00
Rob Clark
dafc2f1887 freedreno/gmem: fix hw binning hangs with large render targets
On all 3 gens, we have 4 bits for width and height in the VSC pipe
config.  And overflow results in setting width and/or height to zero
which causes hangs.
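
The overflow described above can be sketched in C. This is a minimal illustration only, assuming a 4-bit field as stated in the message; the names `PIPE_DIM_BITS`, `pack_pipe_dim_unchecked` and `pipe_dim_fits` are hypothetical and are not taken from the freedreno sources.

```c
#include <stdint.h>

/* Hypothetical: a 4-bit register field holds values 0..15. */
#define PIPE_DIM_BITS 4u
#define PIPE_DIM_MASK ((1u << PIPE_DIM_BITS) - 1u)

/* Packing without a range check silently drops the high bits,
 * so a value of 16 is stored as 0. */
static inline uint32_t pack_pipe_dim_unchecked(uint32_t tiles)
{
    return tiles & PIPE_DIM_MASK;
}

/* Returns nonzero if the value can be stored without wrapping. */
static inline int pipe_dim_fits(uint32_t tiles)
{
    return tiles <= PIPE_DIM_MASK;
}
```

A caller would check `pipe_dim_fits()` first and pick a different pipe layout when it fails, rather than programming a zero width or height.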

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-16 16:34:21 -04:00
Rob Clark
da9a1cb8a6 freedreno/ir3: fix crash with atomics
Atomics can have a result value.  And sometimes it is even used.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-16 16:34:21 -04:00
Rob Clark
8b4588b090 mesa/st: fix yuv EGLImage's
Don't reject YUV formats that the driver doesn't handle natively, since
mesa/st already knows how to lower this in shader.

Reported-by: Nicolas Dechesne <ndec@linaro.org>
Fixes: 83e9de2 ("st/mesa: EGLImageTarget* error handling")
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Tested-by: Nicolas Dechesne <ndec@linaro.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-16 16:34:21 -04:00
Rob Clark
12aa1d15d5 ttn: fix dest size for some texture instructions
Some, like lod, don't return 4 components.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-16 16:34:21 -04:00
Rob Clark
2216a95946 ttn: fix txd src sizes
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-16 16:34:21 -04:00
Rob Clark
b00fbb7daf ttn: fix txs dest size
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-16 16:34:21 -04:00
Rob Clark
1303afdd4f freedreno/a5xx: remove unneeded assert
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-16 16:34:21 -04:00
Rob Clark
9235ab6550 freedreno/a5xx: fallback to slow-clear for z32
We probably *could* do this with blit path, but I think it would involve
clobbering settings from batch->gmem (see emit_zs()).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-16 16:34:21 -04:00
Philipp Zabel
cb16d91034 etnaviv: increment the resource seqno in resource_changed
Just increment the resource seqno instead of setting the texture
seqno to be lower by one than the resource seqno.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-05-16 21:07:56 +02:00
Lucas Stach
ba0b7de7e3 etnaviv: clean up sampler view reference counting
Use the proper pipe_resource_reference function instead of
rolling our own.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-05-16 21:07:51 +02:00
Lucas Stach
f8a3991458 etnaviv: apply feature overrides in one central location
This way we can just test the feature bits and don't need to spread
the debug overrides to all locations touching a feature.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-05-16 21:07:46 +02:00
Lucas Stach
20ce6f1361 etnaviv: allow R/B swapped surfaces to be cleared
Fixes: 7f62ffb68a ("etnaviv: add support for rb swap")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-05-16 21:07:42 +02:00
Lucas Stach
8173d7d9e8 etnaviv: stop oversizing buffer resources
PIPE_BUFFER is a target enum, not a binding. This caused the driver to
up-align the height of buffer resources, leading to largely oversizing
those resources. This is especially bad, as the buffer resources used
by the upload manager are already 1MB in size. Height alignment meant
that those would result in 4 to 8MB big BOs.
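
The class of bug described here, testing a bind mask against a target enum value, might look like the following C sketch. All types and names are hypothetical stand-ins, not the etnaviv or gallium definitions; the only assumption carried over is that the buffer target is an enum value rather than a bit flag.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for a target enum and a set of bind flags. */
enum demo_target { DEMO_TARGET_BUFFER = 0, DEMO_TARGET_TEXTURE_2D = 2 };
enum demo_bind { DEMO_BIND_RENDER_TARGET = 1u << 1, DEMO_BIND_VERTEX_BUFFER = 1u << 4 };

struct demo_template {
    enum demo_target target;
    unsigned bind;
};

/* Buggy: masks the bind flags with a target enum value. With a
 * target value of 0 this can never be true. */
static inline bool demo_is_buffer_buggy(const struct demo_template *t)
{
    return (t->bind & DEMO_TARGET_BUFFER) != 0;
}

/* Fixed: compares the target field against the target enum. */
static inline bool demo_is_buffer(const struct demo_template *t)
{
    return t->target == DEMO_TARGET_BUFFER;
}
```

With the buggy test, a buffer is never recognized as one, so it goes through the same height alignment as a texture.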

Fixes: c9e8b49b88 ("etnaviv: gallium driver for Vivante GPUs")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-05-16 21:07:37 +02:00
Matt Turner
169e1e26ee i965: Fix test_eu_validate.cpp
Broken by commit a7217e909c ("i965: Pass pointer and end of assembly
to brw_validate_instructions").

Reported-by: Aaron Watry <awatry@gmail.com>
2017-05-16 11:45:07 -07:00
Jason Ekstrand
b5437fc05c anv: Implement VK_KHR_get_surface_capabilities2
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-05-16 08:38:46 -07:00
Jason Ekstrand
59f75dc2a4 vulkan/wsi/wayland: Add support for VK_KHR_get_surface_capabilities2
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-05-16 08:38:45 -07:00
Jason Ekstrand
56901c9ea4 vulkan/wsi/x11: Add support for VK_KHR_get_surface_capabilities2
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-05-16 08:38:43 -07:00
Jason Ekstrand
a28163db05 vulkan/wsi: Add get_capabilities2 and get_formats2d interface pointers
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-05-16 08:38:39 -07:00
Jason Ekstrand
52e6271ffd vulkan/wsi: Use vk_outarray for surface_get_formats
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-05-16 08:38:38 -07:00
Jason Ekstrand
c58f8bb56b vulkan: Update registry and headers to 1.0.49
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-05-16 08:38:34 -07:00
Nicolai Hähnle
c485b47383 radeonsi: extract TGSI memory/texture opcode handling into its own file
It's about time to get the growth of si_shader.c somewhat under control.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-16 16:11:55 +02:00
Nicolai Hähnle
cd9504667b radeonsi: make const_array externally accessible
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-16 16:11:54 +02:00
Nicolai Hähnle
f0066eb57e radeonsi: make get_bounded_indirect_index externally accessible
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-16 16:11:54 +02:00
Nicolai Hähnle
9252638afa radeonsi: make emit_waitcnt externally accessible
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-16 16:11:54 +02:00
Nicolai Hähnle
3811730a37 radeonsi: silence a Coverity warning
Coverity doesn't understand that we'll never pass non-NULL for vertex
shaders.

This is a bit lame, actually. A straightforward cross-procedural analysis
limited to this source file should be enough to prove that there's no
NULL-pointer dereference. Oh well.

CID: 1405999
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-16 16:11:54 +02:00
Nicolai Hähnle
4ea67c1751 radeonsi: rename tcs_tes_uses_prim_id for clarity
What we care about is whether PrimID is used while tessellation is
enabled; whether it's used in TCS/TES or further down the pipeline is
irrelevant.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-16 16:11:54 +02:00
Nicolai Hähnle
f4dbe2efb7 radeonsi: fix gl_PrimitiveIDIn in geometry shader when using tessellation
This builds on commit 0549ea15ec ("radeonsi: fix primitive ID in
fragment shader when using tessellation").

Fixes piglit
arb_tessellation_shader/execution/gs-primitiveid-instanced.shader_test

Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-16 16:11:53 +02:00
Nicolai Hähnle
3accda4b82 ac/debug: handle index field in SET_*_REG correctly
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-16 16:11:53 +02:00
Samuel Pitoiset
0ca5bdb330 glsl: simplify link_assign_uniform_storage() a bit
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-16 09:33:06 +02:00
Samuel Pitoiset
8080082ad7 mesa: unify _mesa_uniform() for image uniforms
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-16 09:33:04 +02:00
Samuel Pitoiset
3c95a4fd4c mesa: fix indentation in _mesa_uniform()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-16 09:33:02 +02:00
Samuel Pitoiset
989def73ec mesa: fix indentation in _mesa_associate_uniform_storage()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-16 09:32:59 +02:00
Timothy Arceri
25bb02d7a0 mesa: replace _mesa_problem() with unreachable() in pack.c
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-16 12:25:50 +10:00
Timothy Arceri
51486d3369 mesa: replace _mesa_problem() with unreachable() in mipmap.c
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-16 12:25:50 +10:00
Timothy Arceri
1bd692b946 mesa: replace _mesa_problem() with unreachable() in _mesa_convert_colors()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-16 12:25:49 +10:00
Timothy Arceri
4c1664ff08 mesa: replace _mesa_problem() with unreachable() in _mesa_light()
All drivers but the old nouveau dri driver return after this anyway.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-16 12:25:49 +10:00
Timothy Arceri
59b9544fa7 mesa: replace _mesa_problem() with assert() in hash table
If we ever do something that causes this code path to be taken, the
OpenGL test suites are certain to hit the assert().

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-16 12:25:49 +10:00
Timothy Arceri
f25b2f76b0 mesa: don't crash in KHR_no_error uniform variants when location == -1
From Section 7.6 (UNIFORM VARIABLES) of the OpenGL 4.5 spec:

  "If the value of location is -1, the Uniform* commands will
  silently ignore the data passed in, and the current uniform values
  will not be changed."
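
The rule quoted above can be sketched in C as an early return before any storage is touched. This is an illustration under assumptions: `struct demo_uniform_store` and `demo_uniform4f` are hypothetical and do not mirror Mesa's `_mesa_uniform()` code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical uniform storage: up to four vec4 slots. */
struct demo_uniform_store {
    float values[4][4];
    size_t count;
};

/* Returns 1 if a value was written, 0 if the call was silently
 * ignored (location == -1), and -1 for any other invalid location. */
static inline int demo_uniform4f(struct demo_uniform_store *s, int location,
                                 const float v[4])
{
    if (location == -1)
        return 0; /* spec: ignore the data, leave current values alone */
    if (location < 0 || (size_t)location >= s->count)
        return -1;
    memcpy(s->values[location], v, sizeof(float) * 4);
    return 1;
}
```

The point of the sketch is the ordering: even a path that skips error checking still has to test for -1 before indexing into storage.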

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-16 11:53:16 +10:00
Matt Turner
b1af896853 intel/aubinator_error_decode: Disassemble shader programs
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-15 12:04:04 -07:00
Matt Turner
23685f07d1 intel/aubinator_error_decode: Stop decoding after MI_BATCH_BUFFER_END
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-15 11:43:20 -07:00
Matt Turner
8e7221fa5a intel/tools: Refactor gen_disasm_disassemble() to use annotations
Which will allow us to print validation errors found in shader assembly
in GPU hang error states.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-15 11:43:14 -07:00
Matt Turner
aaa0329b5f intel/decoder: Fix indentation
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-15 11:43:13 -07:00
Matt Turner
3443bd45a3 genxml: Remove brackets from kernel start pointer names
Newer Gens' names don't have the brackets. Having common names will make
some later patches simpler.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-05-15 11:43:11 -07:00
Matt Turner
aae2626be8 i965: Add a weak no-op nir_print_instr() symbol
intel_asm_annotation.c is part of libintel_compiler.la, which contains
code for disassembling and validating shaders that we want to call in
aubinator_error_decode.

dump_assembly() calls nir_print_instr() to print annotations, and
although dump_assembly() is not called by aubinator_error_decode (nor is
any function in intel_asm_annotation.c) it causes undefined references
to nir_print_instr().

To work around, provide a no-op weak symbol to resolve against.
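
A weak fallback definition of this kind could be sketched as below. It assumes a GCC- or Clang-compatible compiler, and the name `demo_print_instr`, its return value, and the opaque `struct demo_instr` are hypothetical; the real `nir_print_instr()` prototype is not reproduced here.

```c
#include <stdio.h>

struct demo_instr; /* opaque, hypothetical instruction type */

/* Weak no-op fallback: if another object file provides a strong
 * definition of demo_print_instr(), the linker picks that one;
 * otherwise references resolve to this stub. */
__attribute__((weak)) int
demo_print_instr(const struct demo_instr *instr, FILE *fp)
{
    (void)instr;
    (void)fp;
    return 0; /* nothing printed */
}
```

The stub only exists to satisfy the linker; a tool that never reaches the printing path pays nothing for it at run time.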
2017-05-15 11:43:01 -07:00
Matt Turner
d98e82c772 i965: Allow brw_eu_validate to handle compact instructions
This will allow the validator to run on shader programs we find in the
GPU hang error state.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-15 11:42:56 -07:00
Matt Turner
a7217e909c i965: Pass pointer and end of assembly to brw_validate_instructions
This will allow us to more easily run brw_validate_instructions() on
shader programs we find in GPU hang error states.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-15 11:42:47 -07:00
Matt Turner
8ca8ebbf78 i965: Mark shader programs for capture in the error state.
When the GPU hangs, the kernel saves some state for us. Until now it has
not included the shader programs, which are very often the reason the
GPU hang occurred. With the programs saved in the error state, we should
be more capable of debugging hangs.

Thanks to Chris Wilson and Ben Widawsky who provided the kernel support
for this feature ("drm/i915: Copy user requested buffers into the error
state"), which will be in kernel v4.13.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-15 11:42:41 -07:00
Tapani Pälli
0aa578714e egl: fix android logger compilation
1ce5853 broke compilation since LOG_ERROR is not defined and also
macro expansion won't work as planned (expands to 'ANDROID_egl2alog[level]')

v2: append 'ANDROID' to egl2alog table and use LOG_PRI
    (suggested by Chih-Wei Huang)

Fixes: 1ce5853 ("egl: simplify the Android logger")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-15 16:03:51 +01:00
Lionel Landwerlin
6d2e912cdb i965: perf: fix pointer to integer cast
v2: Just use cast to uintptr_t (Chris)
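
The cast mentioned in v2 is the usual way to move between a pointer and a fixed-width integer without a size-mismatch warning on 32-bit builds. A minimal sketch, with hypothetical helper names:

```c
#include <stdint.h>

/* Go through uintptr_t so the pointer-to-integer conversion always
 * happens at the pointer's own width, then widen to 64 bits. */
static inline uint64_t demo_ptr_to_u64(const void *ptr)
{
    return (uint64_t)(uintptr_t)ptr;
}

/* Reverse direction: narrow to uintptr_t first, then convert. */
static inline void *demo_u64_to_ptr(uint64_t value)
{
    return (void *)(uintptr_t)value;
}
```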

Reported-by: Mauro Rossi <issor.oruam@gmail.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-05-15 14:06:15 +01:00
Lionel Landwerlin
55be6653e0 intel: gen-decoder: fix xml parser leak
In the unlikely case the parsing of genxml files fails, we were
leaking an xml parser object.
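
A leak of this shape is usually fixed by releasing the object on the failure path as well as on success. The following C sketch uses a hypothetical parser handle and a live-object counter to make the behaviour observable; it does not use the expat API or the actual gen-decoder code.

```c
#include <stdbool.h>
#include <stdlib.h>

static int demo_live_parsers; /* counts parsers not yet freed */

static void *demo_parser_create(void)
{
    void *p = malloc(16);
    if (p)
        demo_live_parsers++;
    return p;
}

static void demo_parser_free(void *p)
{
    if (p) {
        free(p);
        demo_live_parsers--;
    }
}

/* parse_ok stands in for the result of the real parse step. */
static bool demo_load_spec(bool parse_ok)
{
    void *parser = demo_parser_create();
    if (!parser)
        return false;

    bool ok = parse_ok;

    /* Free on both the success and the failure path. */
    demo_parser_free(parser);
    return ok;
}
```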

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-05-15 14:06:11 +01:00
Marek Olšák
1c8f7d3be6 radeonsi: enable threaded_context
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-15 13:01:33 +02:00
Marek Olšák
e24d094d70 gallium/u_threaded: drop and ignore all non-async debug callbacks
This is necessary to comply with OpenGL.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-15 13:01:33 +02:00
Marek Olšák
4c98afb241 gallium/radeon: add threaded context counter monitoring for HUD
"tc" will be initialized by the next commit.

v2: rename stuff according to v2 changes in u_threaded_context

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-15 13:01:33 +02:00
Marek Olšák
7166773f90 radeonsi: implement replace_buffer_storage for the threaded context
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-15 13:01:33 +02:00
Marek Olšák
04299f7e5d gallium/radeon: subclass and handle threaded_query
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-15 13:01:33 +02:00
Marek Olšák
b40d8026fa gallium/radeon: subclass threaded_transfer
v2: use assert on rtransfer->b.staging

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-15 13:01:33 +02:00
Marek Olšák
b4fc399c08 gallium/radeon: subclass threaded_resource
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-15 13:01:33 +02:00
Marek Olšák
93d549b2af gallium/radeon: handle other map buffer flags from the threaded context
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-15 13:01:33 +02:00
Marek Olšák
e11f7e1d59 gallium/radeon: handle TC_TRANSFER_MAP_THREADED_UNSYNC
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-15 13:01:33 +02:00
Marek Olšák
8b5485957e gallium/radeon: unwrap a context if we get a wrapped one
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-15 13:01:33 +02:00
Marek Olšák
42fe45b451 gallium/radeon: require both WRITE and FLUSH_EXPLICIT in buffer_flush_region
spotted randomly.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-15 13:01:33 +02:00
Marek Olšák
b8e552424e gallium/util: add threaded_context as a pipe_context wrapper
v2: - rename num_calls -> num_call_slots (for tc_call)
    - rename num_calls -> num_total_call_slots (for tc_batch)
    - rename num_offloaded/direct_calls -> num_offloaded/direct_slots
    - declare slot[0] instead of slot[1]
    - remove no-op leftover code from tc_draw_vbo
    - use tc_set_resource_reference to fill threaded_transfer
    - fix map flags for sparse buffers
    - cosmetic changes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-15 13:01:33 +02:00
Marek Olšák
dca19b1d42 gallium/u_upload: add u_upload_clone
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-15 13:01:33 +02:00
Marek Olšák
8559fa505d gallium: add flag PIPE_CONTEXT_PREFER_THREADED
State trackers can set this to tell the driver when u_threaded_context is
desirable.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-15 13:01:33 +02:00
Marek Olšák
7622181cad radeonsi/gfx9: add support for Raven
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-15 13:00:26 +02:00
Marek Olšák
efdb378c36 amd/addrlib: import Raven support
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-15 13:00:26 +02:00
Eric Anholt
c98f03c6eb renderonly: Initialize fields of struct winsys_handle.
vc4 was rejecting renderonly's import, because the offset field was
nonzero.
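
The underlying pattern is an uninitialized stack struct handed to another component. A minimal C sketch of the fix, assuming a hypothetical `struct demo_handle` whose fields only loosely resemble `struct winsys_handle`:

```c
#include <string.h>

/* Hypothetical handle description passed to an importer. */
struct demo_handle {
    unsigned type;
    unsigned handle;
    unsigned stride;
    unsigned offset;
};

/* Zero the whole struct first so fields that are not set explicitly
 * (such as offset) have a defined value. */
static inline struct demo_handle demo_make_handle(unsigned handle, unsigned stride)
{
    struct demo_handle h;
    memset(&h, 0, sizeof(h));
    h.handle = handle;
    h.stride = stride;
    return h;
}
```

An importer that rejects a nonzero offset then sees 0 instead of whatever happened to be on the stack.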

Fixes: 848b49b288 ("gallium: add renderonly library")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-05-15 06:38:45 +02:00
Rob Clark
12f9fa564a Revert "freedreno: use bypass if only clears"
Causing issues with stk on a4xx.. still probably a good idea, but seems
some debugging is needed first.

This reverts commit 3ab072d3c8.
2017-05-14 15:10:08 -04:00
Rob Clark
e4ad86952a freedreno: fix crash when flush() but no rendering
If we haven't created a batch, just bail in pipe->flush(), since there
is nothing to do.
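
The early bail-out could be sketched as follows. The context and batch types are hypothetical and much simpler than freedreno's; the sketch only shows the NULL check ahead of any dereference.

```c
#include <stddef.h>

struct demo_batch { int draws; };

struct demo_context {
    struct demo_batch *batch; /* NULL until something is rendered */
    int submitted;            /* number of batches submitted */
};

static inline void demo_flush(struct demo_context *ctx)
{
    /* No batch was ever created: nothing to submit. */
    if (!ctx->batch)
        return;

    ctx->submitted++;
    ctx->batch = NULL;
}
```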

Fixes crash in warsow, which creates a whole bunch of contexts used for
nothing but texture uploads.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-14 15:10:08 -04:00
Rob Clark
06a51fb4e5 freedreno: fix indexbuffer upload
My fault for not having time to test Marek's patches while they were on
list.

Fixes: 330d0607 ("gallium: remove pipe_index_buffer and set_index_buffer")
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-14 15:10:08 -04:00
Bas Nieuwenhuizen
d4e4c36c7c radv: Save descriptor set even if vertex buffers are not saved.
Totally independent.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: 0e6d532d32 "radv/meta: add support for save/restore meta without vertex data."
2017-05-13 23:05:25 +02:00
Rob Clark
8efaae3e19 freedreno/a5xx: hw binning support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-13 13:25:26 -04:00
Rob Clark
c61417e8be freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-13 13:12:36 -04:00
Rob Clark
3ab072d3c8 freedreno: use bypass if only clears
Some things trigger batches that only contain a clear (like glmark2
startup).  No point to use GMEM for this.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-13 13:12:36 -04:00
Pierre Moreau
840f6beb81 nv50/ir: Report wrong prog types using proper var
Coverity caught the use of the uninitialised variable `type`.
However, it was `info->type`, which is initialised, which was meant to
be used.

CID: 1406000
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Fixes: b490ca9a38 ("nv50/ir: Fail if encountering unknown shader type")
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-13 16:26:09 +02:00
Timothy Arceri
812ff333bf mesa: fix KHR_no_error SSO support
Fixes: 00c5119a5e ("mesa: add KHR_no_error support for glUseProgramStages()")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-13 20:56:34 +10:00
Andres Gomez
752a6384af docs: update calendar, add news item and link release notes for 17.0.6
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-05-13 02:20:32 +03:00
Andres Gomez
05e833dfda docs: add sha256 checksums for 17.0.6
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 6a680243fc)
2017-05-13 02:16:11 +03:00
Andres Gomez
a2ca97eaca docs: add release notes for 17.0.6
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 08abf3a2a2)
2017-05-13 02:16:08 +03:00
Andres Gomez
b7af0ddfef bin/get-fixes-pick-list.sh: bring back the warning
We warn again if there is more than one line with the "fixes:" tag.

The warning is silenced when the commit has already landed or each
fixes tag references a commit that is in the branch.

v2:
 - Warn if any of the fixes tags has not landed (Emil)

v3:
 - Remove unnecessary head command
 - Clarify commit message (Emil)
 - Skip already picked commits sooner (Emil)

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-13 00:52:34 +03:00
Andres Gomez
0dead448dd docs: extend until the end of August
Completed the 17.1 cycle and added the beginning of the 17.2 one.

Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-13 00:00:41 +03:00
Andres Gomez
19db5072aa docs: update "Release manager" column
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-13 00:00:34 +03:00
Nicolai Hähnle
5c92b1bf07 glsl: include image qualifiers when printing IR
v2:
- fix copy&paste errors noted by Samuel
- rebase

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-12 10:46:07 +02:00
Nicolai Hähnle
a16ae77185 radeonsi: get rid of secondary input/output word
By keeping track of fewer generics, everything can fit into 64 bits.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-12 10:46:06 +02:00
Nicolai Hähnle
0dd8aa44b3 radeonsi: reduce the number of generics for shader IO unique indices
This is as high as possible while still allowing the bitfields to be
merged in the next commit.

For OpenGL, 32 would be sufficient. Nine apparently uses (much!) higher
indices than that. Indices that are out of bounds don't hurt for VS-PS
pipelines, except that the VS output kill optimization is not applied.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-12 10:46:06 +02:00
Nicolai Hähnle
90339fabd7 radeonsi: at most 8 sets of texture coordinates are supported
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-12 10:46:05 +02:00
Nicolai Hähnle
cfe6e30f1b radeonsi: skip generic out/in indices without a shader IO index
OpenGL uses at most 32 generic outputs/inputs in any stage, and they always
have a shader IO index and therefore fit into the outputs_written/
inputs_read/kill_outputs fields.

However, Nine uses semantic indices more liberally. We support that
in VS-PS pipelines, except that the optimization of killing outputs
must be skipped.
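
One way to picture this is a 64-bit mask in which only some generic indices have a slot. The C sketch below is an assumption-laden illustration: the constants `DEMO_FIRST_GENERIC_SLOT` and `DEMO_MAX_GENERIC` are invented values, and the helpers are not radeonsi functions.

```c
#include <stdint.h>

/* Hypothetical layout: generics occupy slots 20..63 of a 64-bit mask. */
#define DEMO_FIRST_GENERIC_SLOT 20
#define DEMO_MAX_GENERIC 44

/* Returns the mask slot for a generic index, or -1 if it has none. */
static inline int demo_io_index(unsigned generic_index)
{
    if (generic_index >= DEMO_MAX_GENERIC)
        return -1;
    return DEMO_FIRST_GENERIC_SLOT + (int)generic_index;
}

/* Records a written output; indices without a slot are skipped, so
 * they can never take part in mask-based optimizations. */
static inline uint64_t demo_mark_written(uint64_t mask, int slot)
{
    if (slot < 0)
        return mask;
    return mask | (UINT64_C(1) << slot);
}
```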

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-12 10:46:05 +02:00
Nicolai Hähnle
7091fe887b radeonsi: use SI_MAX_IO_GENERIC instead of magic values
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-12 10:46:04 +02:00
Samuel Pitoiset
4aa4e17f4e glsl: order indices for images inside a struct array
ARB_bindless_texture allows images to be declared inside
structures. This is similar to samplers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-12 10:30:37 +02:00
Samuel Pitoiset
f87416f62d glsl: add parcel_out_uniform_storage::set_opaque_indices() helper
In order to sort indices for images inside a struct array we
need to do something similar to samplers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-12 10:30:30 +02:00
Rafael Antognolli
70251e3631 i965: Port 3DSTATE_VF_TOPOLOGY on gen8+ to genxml.
With this last state ported, we can get rid of gen8_draw_upload.c.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-05-11 21:27:38 -07:00
Rafael Antognolli
5bbcbabd86 i965: Port 3DSTATE_INDEX_BUFFER to genxml.
Also make the brw_get_index_type() function not shift its return, since that
is genxml's job now.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-05-11 21:27:38 -07:00
Rafael Antognolli
71bfb44005 i965: Port brw_cs_state tracked state to genxml.
Emit the respective commands using genxml code.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-05-11 21:27:38 -07:00
Rafael Antognolli
d9b4a81672 genxml: Add alias for MOCS.
Use an alias for this field on 3DSTATE_INDEX_BUFFER on gen6+, so we can set
the same value as the defines.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-05-11 21:27:38 -07:00
Rafael Antognolli
c93b17be19 i965/genxml: Mostly style fixes for emit_vertices code.
Several issues were caught on review after the original patch landed.
This commit fixes them.

v2:
   - Fix padding (Topi)
   - Remove .DestinationElementOffset change from this patch (Topi)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-11 21:27:38 -07:00
Glenn Kennard
fa105214d3 r600g: Add defines for per-shader engine settings
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
2017-05-12 12:20:04 +10:00
Glenn Kennard
123ae18f29 r600g: Add instruction encoding defines for MEM_RD
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
2017-05-12 12:19:55 +10:00
Glenn Kennard
8260c4648a r600g: Add scratch ring register defines
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
2017-05-12 12:19:46 +10:00
Kenneth Graunke
b361af6bbe i965: Drop brw_context::viewport_transform_enable.
This was used by the meta fast clear code.  Now that we've switched
back to BLORP, it's always true.

We might want it back when we add a RECTLIST extension to GL, but
that's someday in the future...

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-05-11 16:53:28 -07:00
Kenneth Graunke
f790d6e0b4 i965: Port Gen4-5 VS_STATE to genxml.
It's actually not that much code.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-05-11 16:52:59 -07:00
Kenneth Graunke
4933c3d16e i965: Change GEN_GEN < 7 to GEN_GEN == 6 in 3DSTATE_VS code.
This whole code is surrounded in #if GEN_GEN >= 6, and this code only
applies on Sandybridge.  So, use GEN_GEN == 6 to reduce the delta in
the next patch, when we add Gen4-5 support.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-05-11 16:34:04 -07:00
Kenneth Graunke
d65e19f5c6 genxml: Fix KSPs on Ironlake to be offsets, not pointers.
We use Instruction State Base Address on Ironlake, so we want KSP to be
an offset not an actual pointer.  Gen4/G45 use pointers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-11 16:33:48 -07:00
Samuel Pitoiset
4ad5fa617c glsl: simplify set_opaque_binding()
While we are at it, update the GLSL spec comment.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-11 21:08:32 +02:00
Samuel Pitoiset
a29810e27c glsl: add missing check for samplers in set_opaque_binding()
Like images, this prevents out-of-bounds access when the explicit
binding layout qualifier is used with an array which contains
too many samplers.
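
The kind of range check involved can be sketched in C as below. The helper name and parameters are hypothetical, and the sketch does not claim to match what `set_opaque_binding()` does when the check fails.

```c
#include <stdbool.h>

/* True if units binding .. binding + array_size - 1 all lie below
 * max_units. Written with a subtraction so it cannot overflow. */
static inline bool demo_binding_in_range(unsigned binding, unsigned array_size,
                                         unsigned max_units)
{
    return binding <= max_units && array_size <= max_units - binding;
}
```

For example, with 32 units a two-element array at binding 30 fits, while a three-element array at the same binding does not.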

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-11 21:08:28 +02:00
Samuel Pitoiset
be7a9066d9 mesa: remove useless get_uniform_parameter() declaration
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-11 21:08:24 +02:00
Samuel Pitoiset
7a37f5ade6 mesa: remove unused gl_program_parameter::Initialized
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-11 21:08:21 +02:00
Marek Olšák
479e76bc1f gallium/tests: fix build after index buffer changes
for some reason, only scons can build these.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-11 17:11:22 +02:00
Emil Velikov
709468a808 configure: remove unneeded bits around libunwind handling
If libunwind is not found we'll fail at PKG_CHECK_MODULES, so the
follow-up check will be false. Additionally the AM_CONDITIONAL is not
used, so we can drop it.

Fixes: 3bcef6aa24 ("configure.ac: honour --disable-libunwind if the .pc file is present")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-11 14:02:07 +01:00
Emil Velikov
a68d306a1d docs/releasing: don't forget to update the calendar
Suggested-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-11 14:01:37 +01:00
Emil Velikov
832180554a docs: remove released versions from the calendar
v2: Remove Mesa 17.1.0 as well

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch> (v1)
2017-05-11 14:00:48 +01:00
Emil Velikov
4588912425 virgl: remove unused draw include
Driver does not use the gallium draw module.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-05-11 13:58:20 +01:00
Emil Velikov
88b8aaea3b radeon: automake: remove unneeded elf Cflags/Libs
No longer required as of commit d90bf4ef3e ("radeon: remove unused
radeon_elf_util.{c,h}")

v2: Add the required libelf link in src/amd/Makefile.common.am

Fixes: d90bf4ef3e ("radeon: remove unused  radeon_elf_util.{c,h}")
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2017-05-11 13:58:20 +01:00
Emil Velikov
4c22b99953 anv: document that anv_gem_mmap returns MAP_FAILED on error
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-05-11 13:58:20 +01:00
Emil Velikov
1ce58536ec egl: simplify the Android logger
Drop the unsupported pre-JellyBean macros and use a simple egl2android
mapping. With this we lose the explicit abort() provided by LOG_FATAL,
although Mesa already calls exit(1) in case of a fatal error.

Suggested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rob Herring <robh@kernel.org>
2017-05-11 13:58:20 +01:00
Rob Herring
bec1c13be2 Android: Drop linking libgcc
Including libgcc breaks on Android O (master). This doesn't appear to be
needed any more as both Android M and N have also been built w/o libgcc.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-11 13:52:21 +01:00
Rob Herring
d31a2b4d49 Android: Add LLVM support for Android O
Android O moves to LLVM 3.9 and also has some differences in header
dependencies as LLVM has moved to blueprint files. It seems libLLVMCore
was only needed for header dependencies, so we can drop that for O.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-11 13:52:21 +01:00
Rob Herring
26aee6f4d5 Android: rework LLVM build support
Currently, building with "mmma external/mesa3d" which builds all targets
and dependencies is broken for targets that require LLVM. This is due to
the build settings depending on MESA_ENABLE_LLVM. Instead of using a
conditional in the global Android.common.mk, make all the components that
need LLVM explicitly include the necessary build settings.

GALLIVM_CPP_SOURCES doesn't exist anymore, so remove that as well.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-11 13:52:21 +01:00
Rob Herring
e2ff12e919 Android: rework libelf dependencies
Add libelf as a library dependency rather than explicitly listing its
include paths. This should work for Android M and later which have the
necessary exported directories in libelf.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-11 13:52:21 +01:00
Rob Herring
06260da16e Android: drop LLVM support on Lollipop
Mesa no longer supports LLVM 3.5 for any targets we support.
Android-x86 adds support for llvmpipe which could work, but android-x86
for L is using mesa 11.0 anyway.

Dropping this support enables clean-up of libelf dependencies.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-11 13:52:21 +01:00
Rob Herring
4c0c3719dc Android: Add driver "all" option to enable all drivers
Add a driver string "all" so that if BOARD_GPU_DRIVERS is set to "all",
all the drivers are enabled in the build. This makes build testing all
drivers easier to maintain.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-11 13:52:21 +01:00
Rob Herring
3f097396a1 Android: push driver build details to driver makefiles
src/gallium/targets/dri/Android.mk contains lots of conditionals for
individual drivers. Let's move these details into the individual driver
makefiles.

In the process, align the make driver conditionals with automake
(i.e. HAVE_GALLIUM_*).

Signed-off-by: Rob Herring <robh@kernel.org>
[Emil Velikov: add the radeon winsys for radeonsi]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-11 13:52:21 +01:00
Rob Herring
2a2dabe1c3 Android: remove needless conditional including of child makefiles
It is not necessary to filter driver and winsys directories based on the
list of enabled drivers. Selecting the included driver libraries or not is
sufficient to control what is built.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-11 13:52:21 +01:00
Rob Herring
88014bc023 Android: Fix swrast only build
A build of only swrast is broken as the Android EGL now depends on
libdrm as does GBM. While we could make EGL conditionally depend on
libdrm, we probably want to enable kms_dri winsys as well and that will
need libdrm enabled. So just always enable libdrm and simplify the
Android makefiles a bit.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
[Emil Velikov: drop related inline comment]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-11 13:52:21 +01:00
Rob Herring
1ef913aacf Android: amd/common: fix dependency on libmesa_nir
Building libmesa_amd_common fails with:

external/mesa/src/amd/common/ac_shader_info.c:23:10: fatal error: 'nir/nir.h' file not found
         ^

external/mesa/src/compiler/nir/nir.h:48:10: fatal error: 'nir_opcodes.h' file not found
         ^

libmesa_amd_common now depends on libmesa_nir, so add it as a dependency
and export the necessary directories.

Fixes: 224cf29 "radv/ac: add initial pre-pass for shader info gathering"
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-11 13:52:21 +01:00
Rob Herring
1082501979 Android: amd: use exported include dirs instead of explicit includes
Add exported include paths rather than explicitly adding the includes
in each user of the common AMD libs.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-11 13:52:21 +01:00
Rob Herring
4eec1cfa8e Android: remove remaining explicit libcxx includes
Explicitly including libcxx includes is not necessary at least on
Android M and later. It appears that libc++ was made the default in
commit "Make libc++ the default STL." in Android build system post L.
However, if L support is still needed, using "LOCAL_CXX_STL=libc++" is
the preferred way.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-11 13:52:20 +01:00
Mauro Rossi
f21454eaa5 Android: define required __STDC* macros as cflags
Necessary to fix the following radeonsi building errors:

In file included from external/mesa/src/gallium/drivers/radeonsi/si_blit.c:24:
In file included from external/mesa/src/gallium/drivers/radeonsi/si_pipe.h:29:
In file included from external/mesa/src/gallium/drivers/radeonsi/si_shader.h:71:
In file included from external/llvm/include/llvm-c/Core.h:18:
In file included from external/llvm/include/llvm-c/ErrorHandling.h:17:
In file included from external/llvm/include/llvm-c/Types.h:17:
external/llvm/include/llvm/Support/DataTypes.h:49:3: error: "Must #define __STDC_LIMIT_MACROS before #including Support/DataTypes.h"
  ^
external/llvm/include/llvm/Support/DataTypes.h:53:3: error: "Must #define __STDC_CONSTANT_MACROS before "         "#including Support/DataTypes.h"
  ^
2 errors generated.

[Emil Velikov: add inline comment about the defines]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-11 13:52:20 +01:00
Mauro Rossi
7e907d8f7f Android: drop static linking of R600 LLVM libraries
Inspired by similar patches from Chih-Wei Huang and Zhen Wu

Linking against llvm with both static and shared may be avoided,
provided that libLLVM shared library for device supports
whole static R600/AMDGPU libraries, necessary for radeonsi/amdgpu.

Complementary changes, limited to the android external/llvm project,
are necessary to correctly build libLLVM

Tested with marshmallow-x86 and nougat-x86 builds

Reviewed-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-11 13:52:20 +01:00
Philipp Zabel
0b31c3adc1 configure.ac: Fix help string for --disable-pwr8 configure option
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-11 09:43:38 +01:00
Timothy Arceri
6d7660cf4b mesa: remove _CurrentFragmentProgram from gl_pipeline_object
This was added in b527dd65c8 as a work around because fixed function
fragment shaders were tracked in ctx->FragmentProgram._Current as
a gl_program rather than gl_shader_program.

However after my refactoring of the program and shader structs
at the end of 2016 which culminated in c505d6d852, we no longer
need gl_shader_program to track the current program making
_CurrentFragmentProgram obsolete.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-11 14:46:39 +10:00
Timothy Arceri
276166c45b mesa: add KHR_no_error support for FramebufferTexture*D functions
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-11 13:53:39 +10:00
Timothy Arceri
20cabc2ac0 mesa: add no error version of framebuffer_texture_with_dims()
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-11 13:53:39 +10:00
Timothy Arceri
304058a1fb mesa: add error version of get_texture_for_framebuffer()
This is a step towards KHR_no_error support.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-11 13:53:39 +10:00
Timothy Arceri
69ca1ef683 mesa: pass rb attachment to _mesa_framebuffer_texture()
This change will help us add KHR_no_error support to the caller.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-11 13:53:39 +10:00
Timothy Arceri
d90ced445c mesa: add _mesa_get_and_validate_attachment() helper
Will be used to add KHR_no_error support. We make this available
externally so it can be called from meta.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-11 13:53:39 +10:00
Timothy Arceri
786b9ad95b mesa: remove _mesa_problem() from a few locations
_mesa_problem() is still useful in some places, such as if a backend
compile fails, but for the majority of cases we should be able to
remove it.

OpenGL test suites are becoming very mature, we should place more
trust in debug builds picking up missed cases.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-11 13:53:39 +10:00
Timothy Arceri
8b00630c4d mesa: make _mesa_get_framebuffer_attachment_parameter() static
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-11 13:53:39 +10:00
Timothy Arceri
a754e4ca38 mesa: fix indentation
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-11 13:53:39 +10:00
Timothy Arceri
e618761233 mesa: remove _mesa from static framebuffer object function
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-11 13:53:39 +10:00
Michel Dänzer
0c67aa8456 gallivm: Fix build against LLVM SVN >= r302589
deregisterEHFrames doesn't take any parameters anymore.

Reviewed-by: Vedran Miletić <vedran@miletic.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-11 11:00:42 +09:00
Timothy Arceri
bdaff25c20 mesa: small _mesa_UseProgram() tidy up
Makes the code easier to follow.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-11 10:56:09 +10:00
Timothy Arceri
244cef1694 mesa: add KHR_no_error support for glBindProgramPipeline()
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-11 10:56:08 +10:00
Timothy Arceri
0bca4784c2 mesa: add KHR_no_error support for glActiveShaderProgram()
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-11 10:56:08 +10:00
Timothy Arceri
00c5119a5e mesa: add KHR_no_error support for glUseProgramStages()
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-11 10:56:08 +10:00
Timothy Arceri
ea4c606441 mesa: create use_program_stages() helper
This will be used to create a KHR_no_error version of
glUseProgramStages().

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-11 10:56:08 +10:00
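
The KHR_no_error commits above share one shape: a worker that performs the state change, a validating entry point, and a no-error entry point that skips validation. The following is a minimal sketch of that shape only. Every name in it (sketch_context, use_stages, SKETCH_ALL_STAGES, and so on) is hypothetical and not taken from Mesa.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_error { SKETCH_NO_ERROR = 0, SKETCH_INVALID_VALUE = 1 };

/* Hypothetical context; last_error records the most recent error. */
struct sketch_context {
    enum sketch_error last_error;
    unsigned bound_stages;
};

/* Hypothetical mask of the stage bits this sketch accepts. */
#define SKETCH_ALL_STAGES 0x3fu

/* Shared worker: performs the state update with no validation at all. */
static void use_stages(struct sketch_context *ctx, unsigned stages)
{
    ctx->bound_stages = stages;
}

/* Validating entry point: rejects bad input, then calls the worker. */
static void use_stages_err(struct sketch_context *ctx, unsigned stages)
{
    assert(ctx != NULL);
    if (stages & ~SKETCH_ALL_STAGES) {
        ctx->last_error = SKETCH_INVALID_VALUE;
        return;
    }
    use_stages(ctx, stages);
}

/* No-error entry point: the application has promised valid input, so
 * validation is skipped and only the worker runs. */
static void use_stages_no_error(struct sketch_context *ctx, unsigned stages)
{
    assert(ctx != NULL);
    use_stages(ctx, stages);
}
```

Keeping the worker free of checks means both entry points share the same state-changing code, so the no-error path cannot drift from the validated one.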
Dave Airlie
fe6c407a33 radv: handle fragment shader srgb resolve pass better
Bas pointed out that the fs key doesn't take srgb into account.
Since there is just one srgb variant, just create a separate
pipeline for it. This also uses the dest format to be more consistent
about when srgb matters.

Fixes: 69136f4e63 "radv/meta: add resolve pass using fragment/vertex shaders"
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-11 10:36:06 +10:00
Kenneth Graunke
32f0dc3a29 i965: Make INTEL_DEBUG=bat decode VS/CLIP/GS/SF/WM/CC_STATE on Gen4-5.
This is something the original decoder did, but I didn't bother with
until now.  I recently had to debug an Ironlake issue, and wanted to
inspect VS_STATE.  So, now it's back.

The other packets in the switch statement are all Gen6/7+, where we
use offsets from dynamic state base address, so we don't need the
gtt_offset subtraction introduced here.  We might want to make a
helper for this hack at some point - perhaps when we introduce the
next occurrence.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-10 11:58:20 -07:00
Kenneth Graunke
0f34b674ed i965: Switch BRW_NEW_CURBE_OFFSETS to BRW_NEW_PUSH_CONSTANT_ALLOCATION.
The BRW_NEW_CURBE_OFFSETS dirty bit is signalled when changing the
partitioning of the Constant Buffer URB section between the various
shader stages, on Gen4-5.

BRW_NEW_PUSH_CONSTANT_ALLOCATION is basically the same thing on Gen7+.

So, save a bit, and use the new name.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-10 11:41:58 -07:00
Kenneth Graunke
608a65ebca i965: Drop BRW_NEW_PUSH_CONSTANT_ALLOCATION from Gen6 code.
Gen6 doesn't have a configurable push constant region.  This is only
used on Gen7+.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-10 11:41:50 -07:00
Kenneth Graunke
3d70e00c62 i965: Only #if...#endif a single function or related section at a time.
Previously we guarded large swathes of code with #if GEN ... #endif
blocks.  This made it difficult to see which generations include what.

This patch splits up the #if..#endif sections so they surround a small
section of code - usually a single function/atom, or sometimes a group
of related functions.  It should make the code easier to work on.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-05-10 11:41:46 -07:00
Kenneth Graunke
774db15aaf i965: Turn brw_get_line_width_float() into brw_get_line_width().
Drop the old brw_get_line_width() helper which returned the unsigned
fixed-point encoding of the line width - it's been dead since the
conversion to GENXML (which does the encoding for us).

Then rename brw_get_line_width_float() to the shorter name.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-05-10 11:41:42 -07:00
Kenneth Graunke
620f12a53f i965: Drop INTEL_DEBUG=stats.
For whatever reason, we had an INTEL_DEBUG=stats option that enabled
various statistics counters on Gen4-5 systems.  It's been around
forever, though I can't think of a single time that it's been useful.

On Gen6+, we enable statistics all the time because they're necessary
to support various query object targets.  Turning them off would break
those queries.

Gen4-5 don't support those queries, so the statistics counters generally
aren't useful; we disabled them by default.  This patch disables them
altogether.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-05-10 11:37:19 -07:00
Kenneth Graunke
31abfd2d35 i965: Disable ARB_pipeline_statistics_query on Gen4-5.
We apparently enabled this on all platforms in Mesa 10.6.  However, it
was only ever implemented for Gen6+.  The Gen4-5 query code goes up in
flames with an "Unrecognized query target" unreachable() error if you
even attempt to use any of the new functionality.

This wasn't caught because the Piglit tests require OpenGL 3.0, which
Gen4-5 cannot support.  The extension spec does say 3.0 is required,
though I'm not sure why - it seems like 2.1 would work fine.

We could implement it anyway, but it's a little bit of a pain due to the
lack of hardware contexts (so we have to snapshot around batches).

Given that it's been 100% broken for two years and I haven't seen a bug
report about it, I'm not terribly inclined to care.  So, let it go.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-05-10 11:37:19 -07:00
Alex Deucher
2f0450c627 radeonsi: add new vega10 pci ids
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-10 13:41:38 -04:00
Marek Olšák
49c326420e st/mesa: move the logic of all_varyings_in_vbos into st_update_array
The function was pretty slow. This brings a substantial decrease in draw
call overhead when min/max index bounds are not needed:

Before:  DrawElements (1 VBO) w/ no state change:          5.75 million
After:   DrawElements (1 VBO) w/ no state change:          7.03 million

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-10 19:29:08 +02:00
Marek Olšák
94506e5642 st/mesa: unify common code in st_draw_vbo functions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-10 19:29:08 +02:00
Marek Olšák
f60f14bdb3 st/mesa: make st_draw_vbo static
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-10 19:29:08 +02:00
Marek Olšák
740ef228f7 radeonsi: remove upload code for zero-stride vertex attribs
st/mesa takes care of it now.

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-05-10 19:29:08 +02:00
Marek Olšák
17f776c27b st/mesa: upload zero-stride vertex attributes here
This is the best place to do it. Now drivers without u_vbuf don't have to
do it.

v2: use correct upload size and optimal alignment

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-10 19:29:08 +02:00
Marek Olšák
70dcb7377d gallium: add PIPE_CAP_CAN_BIND_CONST_BUFFER_AS_VERTEX
The next patch will use it. This is really for svga and GL2-level drivers.

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-05-10 19:29:08 +02:00
Marek Olšák
9db1f9bcd1 st/mesa: simplify the signature of get_client_array
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-10 19:29:08 +02:00
Marek Olšák
e8b2274592 st/mesa: remove vpv->num_inputs dereferences in st_update_array
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-10 19:29:08 +02:00
Marek Olšák
71fde49059 st/mesa: fold error handling into setup_(non_)interleaved_attribs
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-10 19:29:08 +02:00
Marek Olšák
f4d272f6f6 st/mesa: fold cso calls into setup_(non_)interleaved_attribs
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-10 19:29:08 +02:00
Marek Olšák
c334c7dd75 st/mesa: don't call util_draw_init_info in st_draw_vbo
2017-05-10 19:00:16 +02:00
Marek Olšák
330d0607ed gallium: remove pipe_index_buffer and set_index_buffer
pipe_draw_info::indexed is replaced with index_size. index_size == 0 means
non-indexed.

Instead of pipe_index_buffer::offset, pipe_draw_info::start is used.
For indexed indirect draws, pipe_draw_info::start is added to the indirect
start. This is the only case when "start" affects indirect draws.

pipe_draw_info::index is a union. Use either index::resource or
index::user depending on the value of pipe_draw_info::has_user_indices.

v2: fixes for nine, svga
2017-05-10 19:00:16 +02:00
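
How a consumer might read the reworked draw info described in 330d0607ed can be sketched as follows. The struct is a simplified, hypothetical mirror of only the fields the commit message names; the real pipe_draw_info has many more members. The helper names are invented for illustration, and treating `start` as the first index of an indexed draw is an assumption based on the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical mirror of the fields named in the commit
 * message; not the real pipe_draw_info layout. */
struct draw_info_sketch {
    unsigned index_size;      /* 0 = non-indexed, else bytes per index */
    unsigned start;           /* assumed: first index when indexed */
    bool has_user_indices;    /* selects which union member is valid */
    union {
        void *resource;       /* stand-in for a GPU buffer handle */
        const void *user;     /* CPU pointer to user-supplied indices */
    } index;
};

/* index_size == 0 replaces the old boolean "indexed" flag. */
static bool draw_is_indexed(const struct draw_info_sketch *info)
{
    return info->index_size != 0;
}

/* Reads the i-th index of a draw that uses user indices. memcpy avoids
 * alignment and aliasing problems when widening from a byte pointer. */
static uint32_t fetch_user_index(const struct draw_info_sketch *info,
                                 unsigned i)
{
    assert(draw_is_indexed(info) && info->has_user_indices);

    const uint8_t *base = info->index.user;
    size_t offset = ((size_t)info->start + i) * info->index_size;

    if (info->index_size == 1) {
        return base[offset];
    } else if (info->index_size == 2) {
        uint16_t v;
        memcpy(&v, base + offset, sizeof(v));
        return v;
    } else {
        uint32_t v;
        memcpy(&v, base + offset, sizeof(v));
        return v;
    }
}
```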
Marek Olšák
22f6624ed3 gallium: separate indirect stuff from pipe_draw_info - 80 -> 56 bytes
For faster initialization of non-indirect draws.
2017-05-10 19:00:16 +02:00
Marek Olšák
c24c3b94ed gallium: decrease the size of pipe_vertex_buffer - 24 -> 16 bytes
2017-05-10 19:00:16 +02:00
Emil Velikov
fe437882ea docs: add news item and link release notes for 17.1.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-10 15:24:03 +01:00
Emil Velikov
419be3a61f docs: add sha256 checksums for 17.1.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 806f802e7b)
2017-05-10 15:21:41 +01:00
Emil Velikov
c816c848e5 docs: Update 17.1.0 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 15a38605fc)
2017-05-10 15:21:39 +01:00
Samuel Pitoiset
de97e38290 st/glsl_to_tgsi: make sure resource file for samplers is PROGRAM_SAMPLER
Similar to how image resources are handled. That way we are sure
that inst->resource.file is PROGRAM_SAMPLER for "bound" samplers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-10 14:02:21 +02:00
Samuel Pitoiset
169888b55e radeonsi: silence a compiler warning
This fixes:

si_shader.c: In function ‘si_shader_dump_stats’:
si_shader.c:6704:31: warning: passing argument 1 of ‘si_get_max_workgroup_size’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
     si_get_max_workgroup_size(shader);
                               ^~~~~~
si_shader.c:5832:17: note: expected ‘struct si_shader *’ but argument is of type ‘const struct si_shader *’
 static unsigned si_get_max_workgroup_size(struct si_shader *shader)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-10 14:02:17 +02:00
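
The -Wdiscarded-qualifiers warning quoted in 169888b55e appears whenever a pointer-to-const is passed to a parameter declared without const. One common remedy is to make the read-only helper take a pointer-to-const, sketched below. The log does not show which fix the commit actually applied, and the struct and function names here are hypothetical stand-ins rather than radeonsi code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct si_shader; the field is illustrative. */
struct shader_sketch {
    unsigned block_size[3];
};

/* Taking a pointer-to-const lets callers that only hold a
 * `const struct shader_sketch *` call this without discarding the
 * qualifier, and documents that the shader is not modified. */
static unsigned max_workgroup_size(const struct shader_sketch *shader)
{
    assert(shader != NULL);
    return shader->block_size[0] * shader->block_size[1] *
           shader->block_size[2];
}

/* A dump routine that only reads the shader can now stay const too. */
static unsigned dump_stats(const struct shader_sketch *shader)
{
    return max_workgroup_size(shader);
}
```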
Samuel Pitoiset
820966f9bc mesa: use u_bit_scan() in update_program_texture_state()
The check in update_single_program_texture() can also be
removed.

v2: - remove unused 's' variable

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-10 12:14:17 +02:00
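
The bit-scan idiom that 820966f9bc switches to visits only the set bits of a mask instead of testing every slot. Below is a rough sketch of it. `bit_scan_sketch` is a portable stand-in written for this example, not Mesa's u_bit_scan() itself, and `sum_enabled` is a hypothetical caller. It assumes `unsigned` is at least 32 bits wide.

```c
#include <assert.h>

/* Stand-in for a bit-scan helper: returns the index of the lowest set
 * bit and clears that bit in *mask. The mask must be non-zero. */
static int bit_scan_sketch(unsigned *mask)
{
    assert(*mask != 0);

    int i = 0;
    while (!(*mask & (1u << i)))
        i++;
    *mask &= ~(1u << i);
    return i;
}

/* Hypothetical caller: sums per-unit values for enabled units only.
 * The loop body runs once per set bit, so no per-unit "is this one
 * enabled?" check is needed inside it. */
static unsigned sum_enabled(unsigned enabled_mask, const unsigned values[32])
{
    unsigned total = 0;

    while (enabled_mask) {
        int unit = bit_scan_sketch(&enabled_mask);
        total += values[unit];
    }
    return total;
}
```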
Samuel Pitoiset
6a1f324e4a mesa: remove never used gl_shader_compiler_options::EmitNoFunctions
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2017-05-10 12:10:50 +02:00
Nicolai Hähnle
362f8f6798 radeonsi: dump compute descriptor lists
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-10 08:58:53 +02:00
Nicolai Hähnle
30267256df radeonsi: dump both enabled and required descriptor slots
This allows a meaningful dump with info == NULL (for compute shaders).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-10 08:58:50 +02:00
Nicolai Hähnle
571597bf47 radeonsi: dump compute shader as part of debug dump
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-10 08:58:48 +02:00
Nicolai Hähnle
fbb2886634 radeonsi: move struct si_compute into a header
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-10 08:58:46 +02:00
Nicolai Hähnle
1a3bedd4b7 radeonsi: split descriptor list dumping
Prepare for dumping CS descriptor list.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-10 08:58:44 +02:00
Nicolai Hähnle
83f56e531d radeonsi: split shader dumping
Prepare for dumping compute shaders.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-10 08:58:41 +02:00
Nicolai Hähnle
0282214c72 radeonsi: more const qualifiers in shader dump functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-10 08:58:39 +02:00
Nicolai Hähnle
db3559da12 ddebug: implement dd_dump_launch_grid
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-10 08:58:37 +02:00
Nicolai Hähnle
bf4ecfec4b ddebug: extract dd_dump_shader
Will be re-used for compute shaders.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-10 08:58:34 +02:00
Nicolai Hähnle
fa1519d0c9 gallium/util: dump tokens in util_dump_shader_state only if type is TGSI
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-10 08:58:32 +02:00
Nicolai Hähnle
bcc37711cd gallium/util: add util_dump_grid_info
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-10 08:58:23 +02:00
Grazvydas Ignotas
45ccb661d8 radv: always free nir shaders from modules on stack
valgrind reports them as leaked, and I could not find anything making a
copy of the nir pointer. Also, radv_device_init_meta_blit_color() is
already freeing them unconditionally like this.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-05-10 01:13:44 +03:00
Grazvydas Ignotas
0ef302638f anv: don't leak DRM devices
After successful drmGetDevices2() call, drmFreeDevices() needs to be
called.

Fixes: b1fb6e8d "anv: do not open random render node(s)"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> # radv version
2017-05-10 01:13:44 +03:00
Grazvydas Ignotas
e0aee8b667 anv: fix possible stack corruption
drmGetDevices2 takes count and not size. Probably hasn't caused problems
yet in practice and was missed as setups with more than 8 DRM devices
are not very common.

Fixes: b1fb6e8d "anv: do not open random render node(s)"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-05-10 01:13:44 +03:00
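
The count-versus-size mix-up fixed in e0aee8b667 can be illustrated with a small sketch. `fake_get_devices` is a hypothetical stand-in for an enumeration call that, as the commit message says of drmGetDevices2, takes a number of array slots rather than a size in bytes. It is not the libdrm function, and the "12 devices" are invented.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical enumeration call: fills up to max_devices slots and
 * returns how many it wrote. */
static int fake_get_devices(const void *devices[], int max_devices)
{
    static const int available[12];   /* pretend 12 devices exist */
    int n = 0;

    assert(devices != NULL && max_devices >= 0);
    while (n < max_devices && n < (int)ARRAY_SIZE(available)) {
        devices[n] = &available[n];
        n++;
    }
    return n;
}

/* Enumerates into an 8-slot stack array using the element count. */
static int enumerate_devices(void)
{
    const void *devices[8];

    /* Passing sizeof(devices) here would hand over the byte size
     * (8 * sizeof(void *)), letting the callee write past the 8 slots
     * whenever more than 8 devices are present. ARRAY_SIZE yields 8. */
    return fake_get_devices(devices, (int)ARRAY_SIZE(devices));
}
```

With 12 pretend devices and 8 slots, the call stops at 8, which is the behaviour the stack array can actually hold.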
Jason Ekstrand
037ce253b1 i965/vec4: Delete the system value infrastructure
The only thing still using it is INVOCATION_ID for geometry shaders.
That's easily enough inlined into the nir_intrinsic_load_invocation_id
handling code.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:08:07 -07:00
Jason Ekstrand
2e9916ea04 i965/vec4: Use NIR to do GS input remapping
We're already doing this in the FS back-end.  This just does the same
thing in the vec4 back-end.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:08:07 -07:00
Jason Ekstrand
e31042ab40 i965/fs: Move remapping of gl_PointSize to the NIR level
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:08:06 -07:00
Jason Ekstrand
5b00c3cc05 i965/nir: Inline remap_inputs_with_vue_map
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:08:06 -07:00
Jason Ekstrand
0d5f89cdc3 i965/vec4: Use NIR remapping for VS attributes
The NIR pass already handles remapping system values to attributes for
us so we delete the system value code as part of the conversion.

We also change nir_lower_vs_inputs to take an explicit inputs_read
bitmask and pass in the inputs_read from prog_data instead of pulling
it out of NIR.  This is because the version in prog_data may get
EDGEFLAG added to it on some old platforms.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:08:06 -07:00
Jason Ekstrand
80aa6e9d32 intel/compiler/vs: Move inputs_read handling to generic code
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:08:03 -07:00
Jason Ekstrand
d2fe804d18 i965/vec4: Set VERT_BIT_EDGEFLAG based on the VUE map
We also add a nice little comment to make it more clear exactly what
happens with the edge flag copy.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:07:47 -07:00
Jason Ekstrand
ca4d192802 i965/fs: Lower gl_VertexID and friends to inputs at the NIR level
NIR calls these system values but they come in from the VF unit as
vertex data.  It's terribly convenient to just be able to treat them as
such in the back-end.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:07:47 -07:00
Jason Ekstrand
24e6fba500 i965/vs: Set uses_vertexid and friends from brw_compile_vs
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:07:47 -07:00
Jason Ekstrand
5e832302dc i965: Move multiply by 4 for VS ATTR setup into the scalar backend.
The vec4 backend will want to count in units of vec4s, not scalar
components.  The simplest solution is to move the multiplication by 4
into the scalar backend.  This also improves consistency with how we
count varyings.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:07:47 -07:00
Jason Ekstrand
36764b6923 i965/nir: Inline remap_vs_attrs
Now that we have nice block iterators, there's no good reason for this
to be off on its own.  While we're here, we convert to using the NIR
const index getters/setters instead of whacking const_index values
directly.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:07:47 -07:00
Jason Ekstrand
b86dba8a0e nir: Embed the shader_info in the nir_shader again
Commit e1af20f18a changed the shader_info
from being embedded into being just a pointer.  The idea was that
sharing the shader_info between NIR and GLSL would be easier if it were
a pointer pointing to the same shader_info struct.  This, however, has
caused a few problems:

 1) There are many things which generate NIR without GLSL.  This means
    we have to support both NIR shaders which come from GLSL and ones
    that don't and need to have an info elsewhere.

 2) The solution to (1) raises all sorts of ownership issues which have
    to be resolved with ralloc_parent checks.

 3) Ever since 00620782c9, we've been
    using nir_gather_info to fill out the final shader_info.  Thanks to
    cloning and the above ownership issues, the nir_shader::info may not
    point back to the gl_shader anymore and so we have to do a copy of
    the shader_info from NIR back to GLSL anyway.

All of these issues go away if we just embed the shader_info in the
nir_shader.  There's a little downside of having to copy it back after
calling nir_gather_info but, as explained above, we have to do that
anyway.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-09 15:07:47 -07:00
Kenneth Graunke
d4fa0a0fa6 mesa: Make _mesa_primitive_restart_index a static inline in the header.
It's now basically a single expression, so it probably makes sense to
have it inlined into the callers.

Suggested by Marek.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-09 12:07:34 -07:00
Rob Herring
db6f38cb6a freedreno: fix clang error in fd_get_compute_param
With commit 10c17f23b7 ("freedreno: core compute state support"),
Android builds fail with the following error:

external/mesa3d/src/gallium/drivers/freedreno/freedreno_screen.c:610:17: error: format string is not a string literal (potentially insecure) [-Werror,-Wformat-security]
                        sprintf(ret, ir);
                                     ^~

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-09 14:20:54 -04:00
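
The -Wformat-security error quoted in db6f38cb6a is raised because a non-literal string is used as the format argument. A common way to avoid it is to pass a literal "%s" format, sketched below. The log does not show the actual change, and `copy_ir_name` is a hypothetical helper, not freedreno code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copies an IR name into a caller-provided buffer
 * and returns the number of characters stored (excluding the NUL). */
static int copy_ir_name(char *ret, size_t ret_size, const char *ir)
{
    assert(ret != NULL && ir != NULL && ret_size > 0);

    /* sprintf(ret, ir) would interpret `ir` as the format string, which
     * is what -Wformat-security rejects. A literal "%s" format copies
     * the string verbatim even if it contains '%' characters. */
    int n = snprintf(ret, ret_size, "%s", ir);

    /* snprintf reports the untruncated length; clamp to what fits. */
    if (n < 0)
        return n;
    if ((size_t)n >= ret_size)
        n = (int)ret_size - 1;
    return n;
}
```

Using snprintf with an explicit size is an extra precaution in this sketch; the "%s" format alone is what addresses the warning.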
Rob Clark
c42952ea90 mesa/vbo: fix invalid min/max indexes
Fixes: c3f37e9b ("st/mesa: use min_index and max_index directly from vbo")
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-05-09 14:20:24 -04:00
Lionel Landwerlin
32f14332f5 intel: compiler: prevent integer overflow
CID: 1399477, 1399478 (Integer handling issues)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-05-09 13:56:17 +01:00
Lionel Landwerlin
85182e490c intel: compiler: remove duplicated code
CID: 1399470: (Control flow issues)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-05-09 13:56:17 +01:00
Lionel Landwerlin
4201b7d1bf intel: gen decoder: don't check for size_t negative values
We should get either 0 or 1 here.

CID: 1373562 (Control flow issues)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2017-05-09 13:54:08 +01:00
Andres Gomez
bac80635af bin/*py: honor editorconfig formatting
Replace the two stray tabs with the respective spaces.

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-09 14:06:52 +03:00
Andres Gomez
d823440fed bin: use tabs for coding style on *.sh files
v2: Instead of changing *.sh, adapt the editorconfig file (Emil).

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-09 14:05:00 +03:00
Mauro Rossi
7993823d38 android: i965: add per-gen libmesa_i965_gen{4,45,5} static
Needed to fix android building errors:

external/mesa/src/mesa/drivers/dri/i965/brw_state_upload.c:148: error: undefined reference to 'gen5_init_atoms'
external/mesa/src/mesa/drivers/dri/i965/brw_state_upload.c:150: error: undefined reference to 'gen45_init_atoms'
external/mesa/src/mesa/drivers/dri/i965/brw_state_upload.c:152: error: undefined reference to 'gen4_init_atoms'
clang++: error: linker command failed with exit code 1 (use -v to see invocation)

Fixes: 5a19d0b ("i965: Get real per-gen atom lists")
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-05-09 08:04:09 +03:00
George Kyriazis
909f72e0a2 swr: fix polygonmode for front==back
Rasterizer core only supports polygonmode front==back.  Add logic for
populating fillMode for the rasterizer only for that case correctly.
Provide enum conversion between mesa enums and core enums.

The core renders lines/points as tris. Previously, code would enable
stipple for polygonmode != FILL.  Modify stipple enable logic so that
this works correctly.

No regressions in vtk tests.
Fixes the following piglit tests:
	pointsprite
	gl-1.0-edgeflag-const

v2: remove cc stable, and remove "not implemented" assert
v3: modified commit message

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-08 21:28:53 -05:00
George Kyriazis
26a9ed6f0f swr/rast: support polygonmode point
Add support for polygonmode point in the binner.  This is done by
splitting BinPostSetupPoints from BinPoints, so the earlier call can be
called from BinTriangles.  Setup has already been done at the time
BinPostSetupPoints needs to be called.

This checkin just adds support in the rasterizer.  A separate checkin
will add the appropriate driver support.

v2: remove cc stable
v3: modified commit message and subject line

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-05-08 21:28:53 -05:00
Timothy Arceri
34c5e58a68 util: move ALWAYS_INLINE macro to util/macro.h
Also added clang check.

macro.h is included by p_compiler.h so no other change is needed.
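
For readers unfamiliar with such macros, a rough sketch of a compiler-dependent ALWAYS_INLINE definition follows. The exact conditions used in Mesa's header may differ; the branches below are an assumption, and add_one is only a usage example:

```c
/* Sketch only: the real header may test different conditions. */
#ifndef ALWAYS_INLINE
#  if defined(__clang__) || defined(__GNUC__)
#    define ALWAYS_INLINE inline __attribute__((always_inline))
#  elif defined(_MSC_VER)
#    define ALWAYS_INLINE __forceinline
#  else
#    define ALWAYS_INLINE inline
#  endif
#endif

/* Usage: request inlining regardless of the optimizer's heuristics. */
static ALWAYS_INLINE int add_one(int x)
{
    return x + 1;
}
```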

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-09 11:21:03 +10:00
Bruce Cherniak
f52e63069a swr: move msaa resolve to generalized StoreTile
v3: list piglit tests fixed by this patch. Fixed typo Tim pointed out.
v2: Reword commit message to more closely adhere to community
guidelines.

This patch moves msaa resolve down into core/StoreTiles where the
surface format conversion routines are available.  The previous
"experimental" resolve was limited to 8-bit unsigned render targets.

This fixes a number of piglit msaa tests by adding resolve support for
all the render target formats we support.

Specifically:
layered-rendering/gl-layer-render: fail->pass
layered-rendering/gl-layer-render-storage: fail->pass
multisample-formats *[2,4,8,16] gl_arb_texture_rg: crash->pass
multisample-formats *[2,4,8,16] gl_ext_texture_snorm: crash->pass
multisample-formats *[2,4,8,16] gl_arb_texture_float: fail->pass
multisample-formats *[2,4,8,16] gl_arb_texture_rg-float: fail->pass

MSAA is still disabled by default, but can be enabled with
"export SWR_MSAA_MAX_COUNT=4" (1,2,4,8,16 are options)
The default is 0, which is disabled.

This patch improves the number of multisample-formats supported by swr,
and fixes several crashes currently in the 17.1 branch.  Therefore, it
should be considered for inclusion in the 17.1 stable release.  Being
disabled by default, it poses no risk to most users of swr.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
cc: mesa-stable@lists.freedesktop.org
2017-05-08 14:45:40 -05:00
Eric Anholt
0ffa06a19b glsl: Don't allow redefining builtin functions on GLSL 1.00.
The spec text cited above says you can't, but only the GLSL 3.00 (redefine
or overload) case was implemented.

Fixes dEQP scoping.invalid.redefine_builtin_fragment/vertex.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Matt Turner <mattst88@gmail.com>
2017-05-08 12:15:49 -07:00
Eric Anholt
79da0ed2fc glsl: Restrict func redeclarations (not just redefinitions) on GLSL 1.00.
Fixes DEQP's scoping.invalid.redeclare_function_fragment/vertex.

v2: Fix accidental rejection of prototype+decl.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)
Tested-by: Matt Turner <mattst88@gmail.com>
2017-05-08 12:15:49 -07:00
Eric Anholt
e5ade7db73 glsl: Ban #undefining __LINE__ and friends on GLES2.
Fixes deqp_gles2 undefine_invalid_object_* failures.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Matt Turner <mattst88@gmail.com>
2017-05-08 12:15:49 -07:00
Eric Anholt
efa9750e96 glsl: Restrict functions to not return arrays or SOAs in GLSL 1.00.
From the spec,

    Arrays are allowed as arguments, but not as the return type. [...] The
    return type can also be a structure if the structure does not contain
    an array.

Fixes DEQP shaders.functions.invalid.return_array_in_struct_fragment.

v2: Spec cite wording change

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Matt Turner <mattst88@gmail.com>
2017-05-08 12:15:49 -07:00
Rob Clark
ae7aa8dbaf nir: fix (hopefully) windows build
Fixes: 53aa109b ("nir: add pass to lower atomic counters to SSBO")
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-08 13:41:16 -04:00
Marek Olšák
25d246f454 radeonsi: rename si_eliminate_const_vs_outputs -> si_optimize_vs_outputs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-08 19:18:29 +02:00
Marek Olšák
34bc470fa6 ac: fix broken elimination of duplicated VS exports
The renumbering code didn't take into account that multiple VS exports
can have the same PARAM index. This also significantly simplifies
the renumbering. Thankfully, we have piglits for this:

    spec@arb_gpu_shader5@arb_gpu_shader5-interpolateatcentroid-packing
    spec@glsl-1.50@execution@interface-blocks-complex-vs-fs

Reported by Michel Dänzer.

Fixes: b08715499e ("ac: eliminate duplicated VS exports")
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-08 19:18:29 +02:00
Chad Versace
0160fb1d50 egl: Fix -Wint-to-pointer-cast
main/egldisplay.c: In function '_eglParseX11DisplayAttribList':
main/egldisplay.c:491:38: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
          display->Options.Platform = (void *)value;
                                      ^

The fix: cast to uintptr_t before void*.
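
A small sketch of the cast pattern. int_to_pointer is a hypothetical helper, and int stands in for whatever integer type the attribute value has:

```c
#include <stdint.h>

/* Hypothetical helper: smuggle a small integer through a void pointer. */
static void *int_to_pointer(int value)
{
    /* (void *)value warns where int and pointers differ in size.
     * Converting to uintptr_t first makes the size change explicit. */
    return (void *)(uintptr_t)value;
}
```

The value can be recovered with the reverse cast, (int)(uintptr_t)ptr.
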
Fixes: ddb99127 egl/x11: Honor the EGL_PLATFORM_X11_SCREEN_EXT attribute
Cc: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-08 13:10:16 -04:00
Marek Olšák
b84979d6c7 st/mesa: remove unused st parameter in init_velement_lowered
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-08 18:33:46 +02:00
Marek Olšák
d801247cec st/mesa: use PIPE_MAX_ATTRIBS as the max number of vertex buffers
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-08 18:32:00 +02:00
Marek Olšák
57fc8ae61a st/mesa: simplify code due to unification to st_common_program
v2: use the st_common_program() helper

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-08 18:32:00 +02:00
Marek Olšák
a159d6ed20 st/mesa: simplify update_constants functions
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-05-08 18:32:00 +02:00
Marek Olšák
bb6e851a1e st/mesa: unify TCS, TES, GS st_*_program structures
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-08 18:32:00 +02:00
Marek Olšák
7ca8b86cb9 st/mesa: decrease the size of remaining st_translate_program array params
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-08 18:32:00 +02:00
Marek Olšák
88d46ac184 st/mesa: remove unused outputSlotToAttr
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-08 18:32:00 +02:00
Marek Olšák
7c44810cc0 st/mesa: remove st_context::vertex_result_to_slot
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-08 18:32:00 +02:00
Marek Olšák
d947e3e2c8 st/mesa: decrease the size of st_vertex_program
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-05-08 18:32:00 +02:00
Marek Olšák
d1ee2b37ff st/mesa: remove struct st_tracked_state
It contains only one member: the update function. Let's use the update
function directly.
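
The idea can be sketched as follows. All names here (struct context, update_a, update_b, run_updates) are hypothetical and do not reflect the state tracker's real types:

```c
struct context { unsigned updates; };   /* hypothetical context */

typedef void (*update_func)(struct context *ctx);

static void update_a(struct context *ctx) { ctx->updates |= 1u << 0; }
static void update_b(struct context *ctx) { ctx->updates |= 1u << 1; }

/* Before: an array of one-member structs, each wrapping an update pointer.
 * After: the function pointers are stored in the table directly. */
static const update_func update_functions[] = { update_a, update_b };

/* Call the update function for every bit set in the dirty mask. */
static void run_updates(struct context *ctx, unsigned dirty)
{
    unsigned count = sizeof(update_functions) / sizeof(update_functions[0]);

    for (unsigned i = 0; i < count; i++) {
        if (dirty & (1u << i))
            update_functions[i](ctx);
    }
}
```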

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-08 18:32:00 +02:00
Nicolai Hähnle
cb2ac69628 radeonsi: split per-patch from per-vertex indices
Make it a bit clearer that the index spaces are logically separate by
having them defined in different functions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-08 17:42:17 +02:00
Nicolai Hähnle
707df19451 radeonsi: clarify documentation of existing SI workaround
Limiting LS-HS to a single wave is required on all SI chips due to an
issue with a power management feature.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-08 17:42:17 +02:00
Nicolai Hähnle
f16b755863 radeonsi: fix gl_PrimitiveID in tessellation with instanced draws on SI
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-08 17:42:17 +02:00
Nicolai Hähnle
b84b631c63 radeonsi: load patch_id for TES-as-ES when exporting for PS
For some reason, this change is only necessary on SI.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-08 17:42:17 +02:00
Nicolai Hähnle
0549ea15ec radeonsi: fix primitive ID in fragment shader when using tessellation
In a VS->TCS->TES->PS pipeline, the primitive ID is read from TES exports,
so it is as if TES were using the primitive ID.

Specifically, this fixes a bug where the primitive ID is not reset at
the start of a new instance.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-08 17:42:17 +02:00
Nicolai Hähnle
854ed47f3e radeonsi: mark fast-cleared textures as compressed when dirtying
There are a bunch of piglit fast clear tests that regressed on SI, for
example ./bin/ext_framebuffer_multisample-fast-clear single-sample.

The problem is that a texture is bound as a framebuffer, cleared, and
then rendered from in a loop that loops through different clear colors.
The texture is never rebound during all this, so the change to
tex->dirty_level_mask during fast clear was not taken into account
when checking for compressed textures.

I have considered simply reverting the problematic commit. However,
I think this solution is better. It does require looping through all
bound textures after a fast clear, but the alternative would require
visiting more textures needlessly on every draw. Draws are much more
common than clears.

Note that the rendering feedback loop rules do not apply here, because
the framebuffer binding is changed between the glClear and the draw
that samples from the texture that was cleared.

Fixes: bdd6449769 ("radeonsi: don't mark non-dirty textures with CMASK as compressed")
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-08 17:42:16 +02:00
Emil Velikov
f12fcb1c9d egl: use designated initializers
All the compilers used to build Mesa support them.
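
For reference, a short example of C99 designated initializers. The struct and function names are made up for illustration and are not the EGL driver's real interface:

```c
#include <stddef.h>

/* Hypothetical function table. */
struct driver_funcs {
    int (*init)(void);
    int (*terminate)(void);
    const char *name;
};

static int example_init(void) { return 1; }

/* Members are named explicitly, so their order does not matter and
 * any member left out (terminate here) is zero-initialized. */
static const struct driver_funcs funcs = {
    .name = "example",
    .init = example_init,
};
```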

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-08 15:34:21 +01:00
Emil Velikov
54f619fb9b egl: drop unneeded sentinel from level_strings[]
The array is local so we already know its size.

v2: Correct loop condition (Bartosz)
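
A sketch of iterating a local array by its known size instead of a NULL sentinel. The strings and the parse_log_level helper are illustrative assumptions, not the contents of egllog.c:

```c
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Returns the index of the matching level name, or -1 if none matches. */
static int parse_log_level(const char *name)
{
    /* No trailing NULL entry: the compiler knows the array's size. */
    static const char *const level_strings[] = {
        "fatal", "warning", "info", "debug",
    };

    /* The loop bound is the element count, not a sentinel test. */
    for (size_t i = 0; i < ARRAY_SIZE(level_strings); i++) {
        if (strcmp(name, level_strings[i]) == 0)
            return (int)i;
    }
    return -1;
}
```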

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-08 15:34:09 +01:00
Emil Velikov
239e7ee91b egl: remove spurious header eglcompiler.h
The header is used only to provide STATIC_ASSERT. The latter is already
available in utils/macros.h so use that instead and kill off the header.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-08 15:33:59 +01:00
Emil Velikov
8d6f92313d egl: remove unneeded else statement in _eglInitLogger
The variable level is already initialized to -1 which is already
interpreted as FALLBACK_LOG_LEVEL.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-08 15:33:56 +01:00
Emil Velikov
1dd038e988 egl: remove no longer needed logger infra
As of last commit nobody requires anything else but the
_eglDefaultLogger(). As such use it directly and simplify the
implementation.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-08 15:33:54 +01:00
Emil Velikov
0372097eec egl: fold Android logger into main/
Will allow us to greatly simplify a lot of the code in egllog.c

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-08 15:33:51 +01:00
Emil Velikov
716e5db610 egl: remove unused _eglSetLogLevel()
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-08 15:33:12 +01:00
Samuel Pitoiset
a3996590b8 glsl: apply the image format for members of structures
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-08 16:04:05 +02:00
Samuel Pitoiset
8a6ecde9c1 glsl: store the image format in glsl_struct_field
ARB_bindless_texture allows declaring image types inside
structures, which means we need to keep track of the format.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-08 16:04:05 +02:00
Samuel Pitoiset
14187e1e9e st/glsl_to_tgsi: don't use rzalloc_array() when it's unnecessary
When the arrays are initialized later on with -1, it is pointless
to use rzalloc_array().
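
The same reasoning in standard C, as a sketch: malloc() stands in for the non-zeroing allocator and calloc() for the zeroing one. alloc_slot_map is a hypothetical helper:

```c
#include <stdlib.h>

/* Allocate a slot map whose entries all start out as -1. */
static int *alloc_slot_map(size_t count)
{
    /* calloc() would zero every entry only for the loop below to
     * overwrite it, so the plain allocator is sufficient. */
    int *map = malloc(count * sizeof(*map));
    if (!map)
        return NULL;

    for (size_t i = 0; i < count; i++)
        map[i] = -1;

    return map;
}
```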

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-08 16:04:05 +02:00
Lionel Landwerlin
e3a5ab2d66 anv: check return value of anv_execbuf_add_bo
CID: 1405919 (Error handling issues)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-08 14:38:27 +01:00
Lionel Landwerlin
6247b8b413 anv: avoid null pointer dereference
The application might not give an output structure.

CID: 1405765 (Null pointer dereferences)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-08 14:38:27 +01:00
Eric Engestrom
dc795f85a5 egl: avoid dereferencing a null display
Fixes: ddb99127a6 ("egl/x11: Honor the EGL_PLATFORM_X11_SCREEN_EXT attribute")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-05-08 11:43:36 +01:00
Andres Gomez
9c70537a52 docs/releasing: added relevant people for build/check with MacOSX
Signed-off-by: Andres Gomez <agomez@igalia.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Jeremy Sequoia <jeremyhu@apple.com>
2017-05-08 11:39:46 +03:00
Andres Gomez
2be0a99052 docs/releasing: added relevant people for build/check with Android
v2: Tapani as main contact and Mauro just for help with
    debugging/building (Mauro).

v3: Mauro may provide feedback for android-x86 only (Mauro).

Signed-off-by: Andres Gomez <agomez@igalia.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Mauro Rossi <issor.oruam@gmail.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-08 11:39:22 +03:00
Andres Gomez
029d7bebed docs/releasing: added relevant people for build/check with Windows
v2: Brian Paul as main contact point and Jose Fonseca as
    fallback (Vinson, Jose)

Signed-off-by: Andres Gomez <agomez@igalia.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Vinson Lee <vlee@freedesktop.org>
Cc: Brian Paul <brianp@vmware.com>
Cc: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-05-08 11:39:20 +03:00
Andres Gomez
fcdc96d1fc docs/releasing: if possible, do some every day use on the RC
Signed-off-by: Andres Gomez <agomez@igalia.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-08 11:39:11 +03:00
Andres Gomez
8a3e33ae5d docs/releasing: further explain the build/check testing process
The build/check test should be done with an appropriate combination of
flags, depending on the changes introduced by the patch set.

Also, mention to cross compile with mingw-w64 for Windows.

Signed-off-by: Andres Gomez <agomez@igalia.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-08 11:39:01 +03:00
Andres Gomez
8058707395 docs/releasing: check in master for forgotten nomination candidates
The maintainer should not just rely on the mesa-stable@ mailing list
but actually check the master branch in search for suitable nomination
candidates.

Signed-off-by: Andres Gomez <agomez@igalia.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-08 11:38:53 +03:00
Andres Gomez
e0f7d25cf0 docs/releasing: format/style homogenization
Signed-off-by: Andres Gomez <agomez@igalia.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-08 11:38:39 +03:00
Andres Gomez
77306e2afc bin/get-fixes-pick-list.sh: don't warn if more than one, go over them
If an identified commit was having more than one fix, we would warn
about that and only treat the first.

Now, we don't warn but treat all of them.

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-05-08 11:28:17 +03:00
Rafael Antognolli
df3b221016 i965: Update gen6_depth_stencil_state to use genX macro.
While moving depth stencil state to use genxml, this one was left
behind.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-07 21:06:05 -07:00
Rafael Antognolli
592d4387a3 i965: Move MOCS macros to brw_state.h.
brw_state.h is a better place to keep them, instead of brw_context.h.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-07 21:02:44 -07:00
Kenneth Graunke
bc074a4518 i965: Don't try to unmap NULL program cache BO.
When running shader-db with intel_stub and recent Mesa, context creation
fails when making a logical hardware context.  In this case, we call
intelDestroyContext(), which gets here and tries to unmap the cache BO.

But there isn't one - we haven't made it yet.  So we try to unmap a
NULL pointer, which used to be safe (it did nothing), but crashes
after commit 7c3b8ed878.

The result is that we crash rather than failing context creation with
a nice message.  Either way nothing works, but this is more polite.
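
The guard amounts to something like the following sketch. The types and functions (struct buffer, struct cache, buffer_unmap, cache_destroy) are hypothetical stand-ins, not the i965 driver's real ones:

```c
#include <stddef.h>

struct buffer { int map_count; };       /* hypothetical buffer object */
struct cache  { struct buffer *bo; };   /* hypothetical program cache */

/* Like the unmap call described above, this dereferences its argument
 * and so must not be given NULL. */
static void buffer_unmap(struct buffer *bo)
{
    bo->map_count--;
}

static void cache_destroy(struct cache *cache)
{
    /* The BO may never have been created if context creation failed
     * early, so only unmap it when it exists. */
    if (cache->bo) {
        buffer_unmap(cache->bo);
        cache->bo = NULL;
    }
}
```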

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-05-07 20:58:44 -07:00
Kenneth Graunke
1456da91c8 Revert "mesa: Require mipmap completeness for glCopyImageSubData(), sometimes."
This reverts commit c5bf7cb529.

This broke rendering in "Total War: WARHAMMER", which uses a single
level RGBA_UINT32 texture and the default filter modes of GL_LINEAR
and GL_NEAREST_MIPMAP_LINEAR.  However, the texture max level is 0,
so it is actually mipmap complete - it's the integer + linear rule
that causes the error.

I'm working with Khronos to find a real solution.  However it turns
out, this patch is not correct and breaks real programs, so let's
revert it for now.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100690
Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16224
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-05-07 20:55:31 -07:00
Grazvydas Ignotas
1f743a0edf glsl: destroy function and subroutine hash tables
Just like other type hash tables are destroyed in
_mesa_glsl_release_types(), also destroy the ones for function and
subroutine types.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-08 13:18:23 +10:00
Dave Airlie
d71ca40a18 radv: fix regression in blit2d push constant change.
These were being fed to the shader as floats via the vertex
path, so also push them as floats here.

This fixes missing overlay in Sascha Willems demos.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-08 00:54:49 +01:00
Dave Airlie
bcf705b62e radv/meta: cleanup some unused code path
After moving everything to using push constants,
these paths are no longer needed.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-08 08:56:55 +10:00
Dave Airlie
387fdf84c5 radv/meta: port blit to using push constants
Remove use of vertex buffer.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-08 08:56:52 +10:00
Dave Airlie
7c8bfb95c6 radv/meta: move blit2d to using push constants
This allows us to drop the vertex buffer.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-08 08:56:49 +10:00
Dave Airlie
b29ea49e8e radv/meta: move clear color to using push constants
The color clear value is uniform and needs only to be emitted from
the frag shader, so just push it down via a push constant,
and remove the vertex buffer completely.

The depth clear value needs to be emitted from the vertex
shader, but is only a single value.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-08 08:56:45 +10:00
Dave Airlie
3b85b630ee radv/meta: use novertex save path for resolve pass.
This was missing in the original change.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-08 08:56:42 +10:00
Dave Airlie
eb2a833679 radv: set base/ranges for push constant loads.
This isn't necessary yet but I'd like to use the range in
some future patches.

[airlied: add new resolve pass]
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-08 08:56:36 +10:00
Dave Airlie
823e9ea8a1 radv: drop resolve hack workarounds
This drops the resolve workarounds that change an image
tiling mode behinds it's back, this is horrible and breaks
the image_view->image relationship. Remove all this.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-07 23:41:39 +01:00
Dave Airlie
2a04f5481d radv/meta: select resolve paths
There are 3 resolve paths, the fastest being the hw resolver
but it has restrictions on tile modes and can't do subresolves,
the compute resolver is next speed wise, but can't handle DCC
destinations, the fragment resolver handles that case.
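
The selection can be pictured roughly as below. The enum, struct and field names are hypothetical. Sending every DCC destination to the fragment path is an assumption, since the text only says the compute path cannot handle it:

```c
#include <stdbool.h>

enum resolve_method { RESOLVE_HW, RESOLVE_COMPUTE, RESOLVE_FRAGMENT };

/* Hypothetical summary of the properties that matter for the choice. */
struct resolve_request {
    bool tile_modes_compatible;  /* hw resolver's tile mode restriction met */
    bool is_subresolve;          /* only part of the image is resolved */
    bool dst_has_dcc;            /* destination uses DCC */
};

static enum resolve_method
select_resolve_method(const struct resolve_request *req)
{
    /* Assumption: DCC destinations always use the fragment path. */
    if (req->dst_has_dcc)
        return RESOLVE_FRAGMENT;

    /* Fastest path, usable only within its restrictions. */
    if (req->tile_modes_compatible && !req->is_subresolve)
        return RESOLVE_HW;

    return RESOLVE_COMPUTE;
}
```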

This will end up with a slowdown as currently we hack the
hw resolver paths when they shouldn't work, but we shouldn't
keep doing that.

The next patch removes the hacks.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-07 23:41:39 +01:00
Dave Airlie
69136f4e63 radv/meta: add resolve pass using fragment/vertex shaders
In order to resolve into DCC enabled dests we need to use
the fragment shader. This reuses the code from the compute
path and implements a resolve path in vertex/fragment shader.

This code isn't used until later.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-07 23:41:39 +01:00
Dave Airlie
19be95f71e radv: add subpass resolve compute path
This adds a path to allow compute resolves to be used
for subpass resolves.

This isn't used yet, but will be later.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-07 23:41:39 +01:00
Dave Airlie
c573076d4a radv/resolve: split resolve emission out for compute
This will allow adding a subpass compute resolve path.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-07 23:41:38 +01:00
Dave Airlie
ff47866107 radv/meta: split out core part of resolve shader
I want to reuse the same code for the fragment shader
version of the resolve shaders.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-07 23:41:38 +01:00
Dave Airlie
588185eb6b radv/meta: add srgb conversion to end of resolve shader.
If we are resolving into an srgb dest, we need to convert
to linear so the store does the conversion back.

This should fix some weirdness seen when subresolves
hit the compute path.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-07 23:41:38 +01:00
Jose Fonseca
dab6a2dfd9 nir: Fix missing snprintf symbol on Windows.
Copy nir_print.c's snprintf definition for now, to unbreak Windows
builds.

We can and should cleanup all snprintf definitions in a follow up
change, but I'd rather not leave the Windows build broken any further.

Trivial.
2017-05-07 19:23:07 +01:00
Pierre Moreau
27ad060c6e nv50/ir: Replace NV50_PROGRAM_IR_* by PIPE_SHADER_IR_*
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-05-07 10:26:37 -04:00
Pierre Moreau
8fe5949b08 nv50/ir: Remove unused translation methods
This code was merged commented out, and has stayed that way ever since.

Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-05-07 10:26:36 -04:00
Pierre Moreau
dd7ab4dcb4 nv50/ir: Free target if we failed to create a program
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-05-07 10:26:36 -04:00
Pierre Moreau
b490ca9a38 nv50/ir: Fail if encountering unknown shader type
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-05-07 10:26:36 -04:00
Dave Airlie
c297e68828 radv: set PERF_MOD in sample state like radeonsi.
This just aligns the code with radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-07 11:19:01 +01:00
Dave Airlie
2add79a732 radv: apply the tess+GS hang workaround to Polaris12 as well
As I pointed out for radeonsi, and AMD confirmed, so fix this
in radv as well.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-07 11:17:48 +01:00
Timothy Arceri
ccf9669cc1 mesa: small texture targetIndex tidy up
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-07 15:29:34 +10:00
Timothy Arceri
68cd0e2000 mesa: fix broken indentation
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-07 15:29:34 +10:00
Timothy Arceri
084fec0e77 mesa: some C99 tidy ups
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-07 15:29:33 +10:00
Timothy Arceri
f9e6820652 mesa: add KHR_no_error support to copy buffer subdata functions
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-07 15:29:33 +10:00
Timothy Arceri
5e86bfaee3 mesa: remove _mesa from static function
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-07 15:29:33 +10:00
Timothy Arceri
2a305fee1b st/mesa: stop calling _mesa_init_buffer_object_functions()
After calling this we were then overriding all the functions with
st versions.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-07 15:29:33 +10:00
Timothy Arceri
123b113f95 mesa: make _mesa_buffer_storage() static
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-07 15:29:33 +10:00
Timothy Arceri
681647eca8 mesa: make _mesa_copy_buffer_sub_data() static
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-07 15:29:33 +10:00
Timothy Arceri
f9c28b9f87 mesa: make _mesa_clear_buffer_sub_data() static
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-07 15:29:33 +10:00
Timothy Arceri
426e4765d2 mesa: add KHR_no_error support for flush mapped buffer functions
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-07 15:29:33 +10:00
Timothy Arceri
30d8dea602 mesa: make _mesa_flush_mapped_buffer_range() static
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-07 15:29:33 +10:00
Timothy Arceri
bbae62c714 mesa: add KHR_no_error support for unmap buffer functions
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-07 15:29:33 +10:00
Timothy Arceri
0b2e4da80a mesa: split unmap_buffer() in two
This will allow us to implement KHR_no_error support for unmap
functions.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-07 15:29:33 +10:00
Timothy Arceri
6c3768692e mesa: make _mesa_unmap_buffer() static
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-07 15:29:33 +10:00
Timothy Arceri
9d010f57db mesa: add KHR_no_error support for some map buffer functions
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-07 15:29:33 +10:00
Timothy Arceri
e83b0a4103 mesa: split out validation from map_buffer_range()
This will allow us to add KHR_no_error support for *BufferRange
functions.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-07 15:29:33 +10:00
Timothy Arceri
8a1c36015b mesa: make map_buffer_range() static
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-07 15:29:33 +10:00
Kenneth Graunke
1151349c2a i965: Drop BRW_NEW_BLORP from 3DSTATE_VF atom.
BLORP doesn't program 3DSTATE_VF, since it doesn't use index buffers,
making the setting irrelevant.  So there's no need to re-emit it after
a BLORP operation - the old setting will still be in place.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-06 15:43:43 -07:00
Kenneth Graunke
6e2c39f562 i965: Port 3DSTATE_VF to genxml and simplify the implementation.
The whole "it might be used for non-indexed draws" thing is no longer
true - it turns out this was a mistake, and was removed in OpenGL 4.5.
(See Marek's commit 96cbc1ca29e0b1f4f4d6c868b8449999aecb9080.)  So
we can simplify this and just program 0 for non-indexed draws.

We can also use #if blocks to remove the atom on Ivybridge/Baytrail,
now that they have a separate atom list from Haswell.  No more runtime
checks.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-05-06 15:43:43 -07:00
Kenneth Graunke
8c5a938171 mesa: Simplify _mesa_primitive_restart_index().
We can use a simple shift equation rather than a switch statement.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-06 15:43:43 -07:00
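
As a rough illustration of the "simple shift equation" mentioned in the commit above, the sketch below derives the fixed restart index from the index size in bytes. The function name restart_index_for_size() is hypothetical and is not the Mesa helper itself; it only assumes that the fixed restart index is the all-ones value for the given index width.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not the Mesa function): returns the all-ones
 * restart index for an index size given in bytes. */
static uint32_t
restart_index_for_size(unsigned index_size)
{
   /* Only 1-, 2- and 4-byte indices are meaningful here. */
   assert(index_size == 1 || index_size == 2 || index_size == 4);

   /* Shift away the unused high bytes:
    * 1 -> 0xff, 2 -> 0xffff, 4 -> 0xffffffff */
   return UINT32_C(0xffffffff) >> (8 * (4 - index_size));
}
```

A switch over the three sizes would give the same results; the shift simply avoids the branches.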
Marek Olšák
314657dc11 Revert "radeonsi: constify a bunch of the perfcounter structs."
This reverts commit 7088b655e8.

It breaks performance counters. If you use them with this commit, they hang
the machine hard. Sysrq and ssh don't work.
2017-05-06 21:17:52 +02:00
Marek Olšák
b0d01bd303 Revert "radeonsi: fix build with GCC 4.8"
This reverts commit 485ece83ac.

It's needed to revert 7088b655e8.
2017-05-06 21:17:52 +02:00
Rob Clark
6050d5bf3d freedreno/a3xx: fix hang w/ large render targets and small gmem
Possibly other gens have a similar limit.  Fixes glmark2 -b shadow
with larger resolutions on devices with small gmem (for example,
fullscreen 1080p on 8x16/db410c).

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-06 14:16:33 -04:00
Rob Clark
4fadfbf176 freedreno/ir3: add macro to declare variable length arrays
We have enough of these, that we should stop open coding this.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-06 14:15:42 -04:00
Nicolai Hähnle
b738fae4b9 glsl: skip tree grafting for sampler and image types
v2: - use is_sampler()/is_image() instead (Samuel Pitoiset)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
d04b0f31d3 glsl: teach lower_ubo_reference about samplers inside structures
In a situation like:

(tex vec4 (record_ref (var_ref f)  tex)  (constant vec2 (0.000000 0.000000))  0 1 () )

The sampler needs to be lowered, otherwise this ends up with
"ir_dereference_variable @ 0x229a100 specifies undeclared variable
`ubo_load_temp' @ 0x2290440"

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
d550024a7e glsl: link bindless layout qualifiers
From section 4.4.6 of the ARB_bindless_texture spec:

   "If both bindless_sampler and bound_sampler, or bindless_image
    and bound_image, are declared at global scope in any
    compilation unit, a link-time error will be generated."

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
d6810ea286 glsl: do not count bindless samplers/images when linking uniforms
From section 2.14.8 of the ARB_bindless_texture spec:

    "(modify second paragraph, p. 126) ... against the
     MAX_COMBINED_TEXTURE_IMAGE_UNITS limit.  Samplers accessed
     using texture handles (section 3.9.X) are not counted against
     this limit."

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
8b4c48673a glsl: lower bindless sampler/image packed varyings
v3: - rebase (and remove (sampler) ? 1 : vector_elements)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
3cdcc5f02f glsl: implement ARB_bindless_texture conversions
From section 5.4.1 of the ARB_bindless_texture spec:

   "In the following four constructors, the low 32 bits of the
    sampler type correspond to the .x component of the uvec2 and
    the high 32 bits correspond to the .y component."

    uvec2(any sampler type)     // Converts a sampler type to a
                                //   pair of 32-bit unsigned integers
    any sampler type(uvec2)     // Converts a pair of 32-bit unsigned integers to
                                //   a sampler type
    uvec2(any image type)       // Converts an image type to a
                                //   pair of 32-bit unsigned integers
    any image type(uvec2)       // Converts a pair of 32-bit unsigned integers to
                                //   an image type

v4: - fix up comment style
v3: - rebase (and remove (sampler) ? 1 : vector_elements)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-06 16:40:19 +02:00
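
The low/high split quoted from section 5.4.1 above can be sketched in plain C. This is only an illustration of the bit layout, assuming a 64-bit handle; struct uvec2 and both function names are made up for the example and are not Mesa or GLSL compiler types.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for a GLSL uvec2; not a Mesa type. */
struct uvec2 {
   uint32_t x;
   uint32_t y;
};

/* Hypothetical equivalent of uvec2(sampler): split a 64-bit handle. */
static struct uvec2
handle_to_uvec2(uint64_t handle)
{
   struct uvec2 v;
   v.x = (uint32_t)(handle & 0xffffffffu); /* low 32 bits  -> .x */
   v.y = (uint32_t)(handle >> 32);         /* high 32 bits -> .y */
   return v;
}

/* Hypothetical equivalent of sampler(uvec2): rebuild the handle. */
static uint64_t
uvec2_to_handle(struct uvec2 v)
{
   return ((uint64_t)v.y << 32) | v.x;
}
```

The two functions are inverses of each other, which is what allows a handle to round-trip through a uvec2.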
Samuel Pitoiset
95c83aba71 glsl: allow bindless samplers/images to be used with constructors
For the explicit conversions.

From section 4.1.7 of the ARB_bindless_texture spec:

   "Samplers are represented using 64-bit integer handles, and
    may be converted to and from 64-bit integers using constructors."

From section 4.1.X of the ARB_bindless_texture spec:

   "Images are represented using 64-bit integer handles, and
    may be converted to and from 64-bit integers using constructors."

v3: - add spec comment
    - update the glsl error message

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v2)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
b98542588c glsl: add is_valid_constructor() helper function
This will help for the explicit conversions for sampler and
image types as specified by ARB_bindless_texture.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
1eff26f02d glsl: add ARB_bindless_texture operations
For the explicit pack/unpack conversions.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
35c8e727a5 glsl: allow bindless samplers/images to be initialized
From section 4.1.7 of the ARB_bindless_texture spec:

   "Samplers may be declared as shader inputs and outputs, as uniform
    variables, as temporary variables, and as function parameters."

From section 4.1.X of the ARB_bindless_texture spec:

   "Images may be declared as shader inputs and outputs, as uniform
    variables, as temporary variables, and as function parameters."

v3: - add spec comment
    - update the glsl error message

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
efb668fb29 glsl: allow bindless samplers/images to be l-values
From section 4.1.7 of the ARB_bindless_texture spec:

   "Samplers can be used as l-values, so can be assigned into and
   used as "out" and "inout" function parameters."

From section 4.1.X of the ARB_bindless_texture spec:

   "Images can be used as l-values, so can be assigned into and
    used as "out" and "inout" function parameters."

v4: - invert the logic
v3: - update spec comment formatting
    - keep the read_only check

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
fa4ebf6b8d glsl: add _mesa_glsl_parse_state object to is_lvalue()
Yes, this is a bit hacky but we don't really have the choice.
Plain GLSL doesn't accept bindless samplers/images as l-values
while it's allowed when ARB_bindless_texture is enabled.

The default NULL parameter is because we can't access the
_mesa_glsl_parse_state object in a few places in the compiler.
One is_lvalue(NULL) call is for IR validation but other checks
happen elsewhere, should be safe.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
42a2fe25f4 glsl: relax bindless sampler arrays indexing
From section 4.1.7 of the ARB_bindless_texture spec:

   "Samplers aggregated into arrays within a shader (using square
    brackets []) can be indexed with arbitrary integer expressions."

v3: - update spec comment formatting

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
ece1c04e8e glsl: reject bindless samplers/images frag inputs without 'flat'
From section 4.3.4 of the ARB_bindless_texture spec

   "(modify last paragraph, p. 35, allowing samplers and images as
    fragment shader inputs) ... Fragment inputs can only be signed
    and unsigned integers and integer vectors, floating point scalars,
    floating-point vectors, matrices, sampler and image types, or
    arrays or structures of these.  Fragment shader inputs that are
    signed or unsigned integers, integer vectors, or any
    double-precision floating-point type, or any sampler or image
    type must be qualified with the interpolation qualifier "flat"."

v3: - update spec comment formatting

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
8834c74fef glsl: allow bindless samplers/images as vertex shader inputs
From section 4.3.4 of the ARB_bindless_texture spec:

   "(modify third paragraph of the section to allow sampler and
    image types) ...  Vertex shader inputs can only be float,
    single-precision floating-point scalars, single-precision
    floating-point vectors, matrices, signed and unsigned integers
    and integer vectors, sampler and image types."

v3: - update spec comment formatting

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
015c0b4a34 glsl: allow bindless samplers/images as varying variables
From section 4.3.4 of the ARB_bindless_texture spec:

   "(modify third paragraph of the section to allow sampler and image
    types) ...  Vertex shader inputs can only be float,
    single-precision floating-point scalars, single-precision
    floating-point vectors, matrices, signed and unsigned integers
    and integer vectors, sampler and image types."

From section 4.3.6 of the ARB_bindless_texture spec:

   "Output variables can only be floating-point scalars,
    floating-point vectors, matrices, signed or unsigned integers or
    integer vectors, sampler or image types, or arrays or structures
    of any these."

v3: - add spec comment

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
89e37f9703 glsl: allow input memory qualifiers for images
ARB_bindless_texture spec allows images to be declared as
shader inputs.

v2: - put the */ on the following line (Timothy Arceri)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
242964ca5c glsl: allow image qualifiers inside structures
ARB_bindless_texture allows images to be declared inside structures
which means that qualifiers like writeonly should be allowed.

I have got confirmation from Jeff Bolz (one author of the spec),
because the spec doesn't clearly explain this.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
48b7882200 glsl: allow bindless images to be declared inside structures
The spec doesn't clearly state this, but I have got clarification
from the spec authors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
e1eb30975a glsl: allow bindless samplers/images inside interface blocks
From section 4.3.7 of the ARB_bindless_texture spec:

   "(remove the following bullet from the last list on p. 39, thereby
    permitting sampler types in interface blocks; image types are also
    permitted in blocks by this extension)"

    * sampler types are not allowed

v3: - update the spec comment
    - update the glsl error message

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
75cc83747e glsl: allow bindless samplers/images as function return
The ARB_bindless_texture spec doesn't clearly state this, but as
it says "Replace Section 4.1.7 (Samplers), p. 25" and,
"Replace Section 4.1.X, (Images)", this should be allowed.

v3: - add spec comment
    - update the glsl error message

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
cb405f170b glsl: allow bindless samplers/images as out and inout parameters
From section 4.1.7 of the ARB_bindless_texture spec:

   "Samplers can be used as l-values, so can be assigned into and used
    as "out" and "inout" function parameters."

From section 4.1.X of the ARB_bindless_texture spec:

   "Images can be used as l-values, so can be assigned into and used as
    "out" and "inout" function parameters."

v3: - add spec comment
    - update the glsl error message

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
4c084f18fd glsl: allow to declare bindless samplers/images as non-uniform
From section 4.1.7 of the ARB_bindless_texture spec:

   "Samplers may be declared as shader inputs and outputs, as uniform
    variables, as temporary variables, and as function parameters."

From section 4.1.X of the ARB_bindless_texture spec:

   "Images may be declared as shader inputs and outputs, as uniform
    variables, as temporary variables, and as function parameters."

v3: - add validate_storage_for_sampler_image_types()
    - update spec comment
    - update the glsl error message

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
115d938cea glsl: process bindless/bound layout qualifiers
This adds bindless_sampler and bound_sampler (and respectively
bindless_image and bound_image) to the parser.

v3: - add an extra space in apply_bindless_qualifier_to_variable()
    - fix indentation in merge_qualifier()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
cf52b8cd21 glsl: do not make sampler/image types readonly variables
In plain GLSL, sampler and image types can only be declared
uniform-qualified global variables or 'in' function parameters.

Setting the read_only flag seems quite useless because other
checks will prevent sampler/image variables from being assigned and
also because the flag is not set for atomic_uint types which are
opaque types.

This will also help for ARB_bindless_texture because samplers
and images can be assigned when they are considered bindless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
c618f31065 glsl: make sampler/image scalar types
As a side effect, this will magically fix std140/std430 interfaces
for bindless samplers/images and will help for implementing the
explicit conversions with constructors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
33931e4062 glsl: make count_attribute_slots() returns 1 for samplers/images
For packed varyings.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
1f40343e9a glsl: make component_slots() returns 2 for samplers/images
Bindless samplers/images are 64-bit unsigned integers, which
means they consume two components as specified by
ARB_bindless_texture.

It looks like we are not wasting uniform storage by changing
this because default-block uniforms are not packed. So, if
we use N uint uniforms, they occupy N * 16 bytes in the
constant buffer. This is something that could be improved.

Though, count_uniform_size needs to be adjusted to not count
a sampler (or image) twice.

As a side effect, this will probably break the cache if you
have one because it will consider sampler/image types as
two components.

v3: - update the comments

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
becc87b84a glsl: make sampler/image types as 64-bit
The ARB_bindless_texture spec says:

   "Samplers are represented using 64-bit integer handles."

and,

   "Images are represented using 64-bit integer handles."

It seems simpler to always consider sampler and image types
as 64-bit unsigned integer.

This introduces a temporary workaround in _mesa_get_uniform()
because at this point no flags are used to distinguish between
bound and bindless samplers. This is going to be removed in a
separate series. This avoids breaking arb_shader_image_load_store-state.

v3: - update the comment slightly

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
042eee2067 glsl: add ARB_bindless_texture enable
This also adds the extension to the standalone GLSL compiler.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-06 16:40:19 +02:00
Samuel Pitoiset
b08a9bf791 mesa: add ARB_bindless_texture to the extensions list
This is required for the following GLSL bits.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-05-06 16:40:19 +02:00
Fredrik Höglund
5ff4858111 radv/meta: fix restoring a push descriptor set
radv_bind_descriptor_set cannot be used to bind a push descriptor set
since a push descriptor set does not have a buffer list. However,
there is no need to add the buffers again when restoring a set, so
this fix is also an optimization.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-05-06 01:46:18 +02:00
Nicolas Boichat
f6ac3d0db6 configure.ac: Also match -androideabi tuple
On ARM Android platforms, the host_os tuple should be linux-androideabi,
so let's match both -android and -androideabi (or any other
-android* tuple) to determine if we should do an Android build.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-05 15:39:38 -07:00
Jason Ekstrand
e05e3e07ab anv/allocator: Only write to _vg_ptr if we have valgrind
This fixes the build when not building against valgrind headers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100945
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-05 12:49:51 -07:00
Daniel Stone
d4342b1398 i915: Fix build break with empty unreachable()
Actually put something in unreachable(), so as not to break the build on
a Friday evening.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reported-by: Mark Janes <mark.a.janes@intel.com>
2017-05-05 18:24:44 +01:00
Marek Olšák
ee5908396e radeonsi: apply the tess+GS hang workaround to Polaris12 as well
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 18:55:03 +02:00
Daniel Stone
8b8af19065 i965: Set modifier for imported and duplicated images
When a buffer is being created from FD or GEM flink import, the current
API makes no provision for passing modifier information along with this.
Set the modifier for such images to DRM_FORMAT_MOD_INVALID.

Also preserve the modifier when duplicating an image, as will be done by
GBM when importing from a wl_buffer.

This doubly tripped up Wayland, as the images would first have been
created (as wl_buffers) with a 0 modifier, and then lost what modifier
they would've had when being duplicated into gbm_bos.

Fixes: d78a36ea62 ("i965/dri: Handle the linear fb modifier")
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-05 17:34:10 +01:00
Daniel Stone
467332a0ab i965: Use helper function for modifier -> tiling
Use a helper function and struct to convert between a modifier and
tiling mode, so we can use it later for a tiling -> modifier lookup.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-05 17:34:10 +01:00
Samuel Pitoiset
485ece83ac radeonsi: fix build with GCC 4.8
Fixes: 7088b655e8 ("radeonsi: constify a bunch of the perfcounter structs.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100937
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-05 18:29:30 +02:00
Samuel Pitoiset
92ab06e782 st/glsl_to_tgsi: fix renumber_registers() in presence of dead code
The TGSI DCE pass doesn't eliminate dead assignments like
MOV TEMP[0], TEMP[1] in presence of loops because it assumes
that the visitor doesn't emit dead code. This assumption is
actually wrong and this situation happens.

However, it appears that the merge_registers() pass accidentally
takes care of this for some weird reasons. But since this pass has
been disabled for RadeonSI and Nouveau, the renumber_registers()
pass which is called *after*, can't do its job correctly.

This is because it assumes that no dead code is present. But if
there is still a dead assignment, it might re-use the TEMP
register id incorrectly and emit wrong code.

This patch fixes the issue by recording writes instead of reads,
which also has the advantage of being faster.

This should fix Unigine Heaven on RadeonSI and Nouveau.

shader-db results with RadeonSI:

47109 shaders in 29632 tests
Totals:
SGPRS: 1923308 -> 1923316 (0.00 %)
VGPRS: 1133843 -> 1133847 (0.00 %)
Spilled SGPRs: 2516 -> 2518 (0.08 %)
Spilled VGPRs: 65 -> 65 (0.00 %)
Private memory VGPRs: 1184 -> 1184 (0.00 %)
Scratch size: 1308 -> 1308 (0.00 %) dwords per thread
Code Size: 60095968 -> 60096256 (0.00 %) bytes
LDS: 1077 -> 1077 (0.00 %) blocks
Max Waves: 431889 -> 431889 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

It's still interesting to disable the merge_registers() pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 09:48:01 +02:00
Iago Toral Quiroga
7761cf6d01 anv/query: handle more cases of 'out of host memory'
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-05-05 08:53:33 +02:00
Nicolas Boichat
63b12b0c77 egl/android: Set EGLSurface.Lost to EGL_TRUE/EGL_FALSE
Lost is an EGLBoolean, so we should assign it to EGL_TRUE/EGL_FALSE,
not true/false.

Fixes: e5eace5868 ("egl/android: Mark surface as lost when dequeueBuffer fails")
Fixes: 0212db3504 ("egl/android: Cancel any outstanding ANativeBuffer in surface destructor")
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-04 20:09:10 -07:00
Jason Ekstrand
98cd512089 anv/allocator: Improve block pool growing asserts
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
24827fdf50 anv: Drop the instruction pool block size
Now that we can allocate states larger than the block size, we no longer
need a block size of 1MB which can be rather wasteful.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
955127db93 anv/allocator: Add support for large stream allocations
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
f82d3d38b6 anv/allocator: Allow state pools to allocate large states
Previously, the maximum size of a state that could be allocated from a
state pool was a block.  However, this has caused us various issues
particularly with shaders which are potentially very large.  We've also
hit issues with render passes with a large number of attachments when we
go to allocate the block of surface state.  This effectively removes the
restriction on the maximum size of a single state.  (There's still a
limit of 1MB imposed by a fixed-length bucket array.)

For states larger than the block size, we just grab a large block off of
the block pool rather than sub-allocating.  When we go to allocate some
chunk of state and the current bucket does not have state, we try to
pull a chunk from some larger bucket and split it up.  This should
improve memory usage if a client occasionally allocates a large block of
state.

This commit is inspired by some similar work done by Juan A. Suarez
Romero <jasuarez@igalia.com>.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
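
The bucket scheme described in the commit above can be pictured roughly as follows. This is a minimal sketch, not the anv code: it assumes power-of-two buckets, and the 64-byte minimum state size and all names are assumptions made for the example. Only the 1MB upper limit comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed limits for illustration; the driver's actual values may differ. */
#define MIN_STATE_SIZE_LOG2 6   /* smallest bucket: 64 bytes (assumption) */
#define MAX_STATE_SIZE_LOG2 20  /* largest bucket: 1MB, per the message */

/* Return the smallest log2 such that (1 << log2) >= size. */
static uint32_t
ilog2_round_up(uint32_t size)
{
   uint32_t log2 = 0;
   while ((UINT32_C(1) << log2) < size)
      log2++;
   return log2;
}

/* Hypothetical helper: map an allocation size to a bucket index. */
static uint32_t
bucket_for_size(uint32_t size)
{
   /* Bounding the size first also keeps the shift above in range. */
   assert(size > 0 && size <= (UINT32_C(1) << MAX_STATE_SIZE_LOG2));

   uint32_t size_log2 = ilog2_round_up(size);
   if (size_log2 < MIN_STATE_SIZE_LOG2)
      size_log2 = MIN_STATE_SIZE_LOG2;
   return size_log2 - MIN_STATE_SIZE_LOG2;
}

/* Hypothetical helper: size in bytes of the states held by a bucket. */
static uint32_t
size_for_bucket(uint32_t bucket)
{
   return UINT32_C(1) << (bucket + MIN_STATE_SIZE_LOG2);
}
```

With these assumed limits the bucket array has 15 entries (64 bytes up to 1MB). A request that finds its own bucket empty could then take a chunk from a larger bucket and split it into size_for_bucket() pieces, as the message describes.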
Jason Ekstrand
8c079b566e anv/allocator: Support pushing multiple blocks onto a free list at once
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
8769fb48fb anv/allocator: Add helpers for dealing with bucket sizes
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
12043ca696 anv/allocator: Add the capability to allocate blocks of different sizes
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
01170df262 anv/allocator: Rework a comment
This commit just fixes up the English a bit and re-flows the comment.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
bcc5d0defb anv/allocator: Tweak the block pool growing algorithm
The old algorithm worked fine assuming a constant block size.  We're
about to break that assumption so we need an algorithm that's a bit more
robust against suddenly growing by a huge amount compared to the
currently allocated quantity of memory.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
d3ed72e2c2 anv/allocator: Embed the block_pool in the state_pool
Now that the state stream is allocating off of the state pool, there's
no reason why we need the block pool to be separate.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
bb2a3f0df8 anv/allocator: Get rid of the ability to free blocks
Now that everything is going through the state pools, the block pool no
longer needs to be able to handle re-use.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
08413a81b9 anv: Allocate binding table blocks through the state pool
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
55f49e6b7e anv/allocator: Add support for "back" allocations to state_pool
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
49ecaf88d1 anv/allocator: Drop the block_size field from block_pool
Since the state_stream is now pulling from a state_pool, the only thing
pulling directly off the block pool is the state pool so we can just
move the block_size there.  The one exception is when we allocate
binding tables but we can just reference the state pool there as well.

The only functional change here is that we no longer grow the block pool
immediately upon creation so no BO gets allocated until our first state
allocation.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
30d63ffe26 anv/allocator: Pull the userptr part of block_pool_grow into a helper
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
c73ce41a48 anv/allocator: Roll fixed_size_state_pool into state_pool
The helper functions aren't really gaining us as much as they claim and
are actually about to be in the way.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
6d02ef011e anv/allocator: Remove the state_size field from fixed_size_state_pool
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
367031a5c8 anv: Get rid of a bunch of uses of size_t
We should only use size_t when referring to sizes of bits of CPU memory.
Anything on the GPU or just a regular array length should be a type that
has the same size on both 32 and 64-bit architectures.  For state
objects, we use a uint32_t because we'll never allocate a piece of
driver-internal GPU state larger than 2GB (more like 16KB).

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
e86aeecb6a anv/allocator: Convert the state stream to pull from a state pool
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
e049dea5b2 anv/allocator: Return a null state for zero-size allocations
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Jason Ekstrand
45e1829274 anv/allocator: Add no-valgrind versions of state_pool_alloc/free
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-05-04 19:07:54 -07:00
Dave Airlie
a096d8d3f7 radv: enable POLARIS12 support.
This just adds the chip in the right places.

We don't set the partial_vs_wave workaround, as radeonsi
doesn't, but have to confirm it's not required.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-05 11:07:40 +10:00
Chad Versace
e5eace5868 egl/android: Mark surface as lost when dequeueBuffer fails
This ensures that future calls to eglSwapBuffers and eglMakeCurrent emit
an error.

This patch is part of a series for fixing
android.hardware.camera2.cts.RobustnessTest#testAbandonRepeatingRequestSurface
on Chrome OS x86 devices.

Cc: mesa-stable@lists.freedesktop.org
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nicolas Boichat <drinkcat@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-04 17:46:34 -07:00
Chad Versace
0212db3504 egl/android: Cancel any outstanding ANativeBuffer in surface destructor
That is, call ANativeWindow::cancelBuffer in droid_destroy_surface().

This should prevent application deadlock when the app destroys the
EGLSurface after EGL has acquired a buffer from SurfaceFlinger
(ANativeWindow::dequeueBuffer) but before EGL has released it
(ANativeWindow::enqueueBuffer).

This patch is part of a series for fixing
android.hardware.camera2.cts.RobustnessTest#testAbandonRepeatingRequestSurface
on Chrome OS x86 devices.

Cc: mesa-stable@lists.freedesktop.org
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nicolas Boichat <drinkcat@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-04 17:46:33 -07:00
Chad Versace
23c86c74cc egl: Emit error when EGLSurface is lost
Add a new bool, _EGLSurface::Lost, and check it in eglMakeCurrent and
eglSwapBuffers. The EGL 1.5 spec says that those functions emit errors
when the native surface is no longer valid.

This patch just updates core EGL. No driver sets _EGLSurface::Lost yet.

I discovered that Mesa failed to detect lost surfaces while debugging an
Android CTS camera test,
android.hardware.camera2.cts.RobustnessTest#testAbandonRepeatingRequestSurface.
This patch doesn't fix the test, though, because the test expects
EGL_BAD_SURFACE when the surface becomes lost, and this patch actually
complies with the EGL spec. If I interpreted the EGL spec correctly,
EGL_BAD_NATIVE_WINDOW or EGL_BAD_CURRENT_SURFACE is the correct error.

Cc: mesa-stable@lists.freedesktop.org
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nicolas Boichat <drinkcat@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-04 17:46:33 -07:00
Marek Olšák
69e6eab653 winsys/amdgpu: fix Polaris12 (RX 550) breakage
reported by Greg White.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100892
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
2017-05-05 01:21:32 +02:00
Kenneth Graunke
9377801fbd anv: Simplify Cherryview line handling.
We can just use the new CHVLineWidth field rather than an entirely
different generation's packing function.

v2: Inline the function (requested by Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-04 16:17:34 -07:00
Kenneth Graunke
31f094e691 i965: Fix line width on Cherryview.
We just add another field to gen8.xml for the Cherryview line width,
rather than trying to replicate the gymnastics done in the Vulkan
driver to use gen9 SF pack functions.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-04 16:17:34 -07:00
Marek Olšák
194d9b27cc radeonsi/gfx9: allow the scratch buffer in HS and GS
It works now.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
8ac4923a67 radeonsi: prevent race conditions when doing scratch patching
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
9dfc030b48 radeonsi: separate scratch state patching code into its own function
Picked from a different branch. When we stop using the scratch patching,
this function will not be called.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
1b01014cbf radeonsi/gfx9: also apply scratch relocations to the 1st shader of merged shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
e107c5a426 radeonsi/gfx9: set correct LLVM calling conventions for merged shaders
for scratch support

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
a47289f8fc radeonsi: remove unused parameters from si_shader_apply_scratch_relocs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
2d662c0cba radeonsi: inline si_llvm_shader_type into si_llvm_create_func
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
4c0e68dfe5 radeonsi: don't use util_memcpy_cpu_to_le32 for shader uploads
at least I think this is correct.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
7660c9ee4e radeonsi: make si_compile_llvm static
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
f8f8242e8b radeonsi: fold surrounding code into si_llvm_finalize_module
and rename to si_llvm_optimize_module.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
5dad0c3477 radeonsi: don't call eliminate_const_vs_outputs in shaders without VS exports
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
12beef0374 radeonsi: drop support for LLVM 3.8
LLVM 3.8:
- had broken indirect resource indexing
- didn't have scratch coalescing
- was the last user of problematic v16i8
- only supported OpenGL 4.1

This leaves us with LLVM 3.9 and LLVM 4.0 support for Mesa 17.2.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
4d32b4ac99 radeonsi: stop using v16i8
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Marek Olšák
283a1d1e27 radeonsi/gfx9: make some PA & DB registers match the closed Vulkan driver
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-05 00:23:44 +02:00
Dave Airlie
efa19f5a54 radv: don't advertise transfer props unless we can do anything else
There is no reason to advertise transfer ability for formats we can't
use for anything else. This stops some CTS tests hitting internal
error for 64-bit types when they see the transfer flags.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-05 05:46:02 +10:00
Rob Clark
7b55a05159 freedreno/a5xx: compute shader support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-04 13:48:06 -04:00
Rob Clark
10c17f23b7 freedreno: core compute state support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-04 13:48:06 -04:00
Rob Clark
2ce449fa7d freedreno/ir3: compute shader support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-04 13:48:06 -04:00
Rob Clark
39c5a46a7a freedreno/a5xx: SSBO support
To simplify things for now, since all the gfx shader stages share a
single SSBO state block, only advertise SSBO support for fragment shader
(and compute when we have that).  We could possibly use a fixed-
partitioning of the SSBO index space to support SSBOs on other stages
without having to resort to shader variants.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-04 13:48:06 -04:00
Rob Clark
edde00f5f1 freedreno/ir3: SSBO/atomic support
TODO cwabbott pointed out a write-after-read hazard, which affects both
this and arrays.  A write needs to depend on *all* reads since the last
write, not just the last read.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-04 13:48:06 -04:00
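
The TODO in the commit above says a write must depend on every read since the last write, not only the most recent one. The rule can be sketched as below. This is a minimal illustration, not ir3 code: the `DepTracker` class, its methods, and the instruction labels are all hypothetical.

```python
# Hypothetical sketch of write-after-read dependency tracking for one
# array or SSBO. Names are illustrative and do not come from ir3.

class DepTracker:
    def __init__(self):
        self.last_write = None          # most recent write, if any
        self.reads_since_write = []     # every read issued after that write

    def read(self, instr):
        """Record a read; it depends only on the last write (read-after-write)."""
        deps = [] if self.last_write is None else [self.last_write]
        self.reads_since_write.append(instr)
        return deps

    def write(self, instr):
        """Record a write; it depends on ALL reads since the last write
        (write-after-read) plus the last write itself (write-after-write)."""
        deps = list(self.reads_since_write)
        if self.last_write is not None:
            deps.append(self.last_write)
        self.last_write = instr
        self.reads_since_write = []
        return deps
```

If only the last read were tracked, a scheduler could move the second write ahead of an earlier read that still needs the old value.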
Rob Clark
4d841fbaae freedreno: core SSBO support
The generation-independent support for binding shader buffer objects.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-04 13:48:06 -04:00
Rob Clark
fd6ed7b562 freedreno/ir3: resync instr-a3xx.h/disasm-a3xx.c
Sync to the same files from freedreno.git to correct decoding of ldgb/
stgb instructions.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-04 13:48:06 -04:00
Rob Clark
5f7e55582e mesa/st: compute support for glsl_to_nir
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-04 13:48:06 -04:00
Rob Clark
53aa109ba2 nir: add pass to lower atomic counters to SSBO
This is equivalent to what mesa/st does in glsl_to_tgsi.  For most hw
there isn't a particularly good reason to treat these differently.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-04 13:48:06 -04:00
Rob Clark
fd500cc10b nir: add a C wrapper for glsl_type::get_interface_instance()
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-04 13:48:06 -04:00
Emil Velikov
d230ef842c mapi_abi.py: remove no longer used --mode option
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-04 18:17:06 +01:00
Emil Velikov
3698fe295a mapy_abi.py: remove dead output_for_app generator
Used by the OpenVG codebase.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-04 18:17:06 +01:00
Emil Velikov
4562d88c1d mapi: replace mapi_table abstraction
Replace all instances of mapi_table with the actual struct _glapi_table.
The former may have been needed when the OpenVG was around. But since
that one is long gone, there's no point in having the current confusing
mix of the two.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-04 18:17:03 +01:00
Emil Velikov
424cb9d3ea mesa/tests: remove no longer needed HAVE_SHARED_GLAPI define
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 18:12:11 +01:00
Emil Velikov
edb7165b25 gl_table.py: always regenerate the complete struct _glapi_table
Currently we would generate a partial one as we do non-shared glapi.
At the same time since it's local, we don't care that much if we have a
few extra bytes of space in the table.

Drop the guard, which allows us to simplify both build system and code.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 18:12:07 +01:00
Emil Velikov
6d6913ba5a glx/apple: remove empty variable SHARED_GLAPI_CFLAGS
Cc: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 18:12:05 +01:00
Emil Velikov
94d48864ea glx/windows: remove empty variable SHARED_GLAPI_CFLAGS
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 18:12:02 +01:00
Emil Velikov
27a4fd5047 glx: automake: scons: remove unneeded GLX_SHARED_GLAPI define
There are no in-tree users of it.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 18:11:59 +01:00
Emil Velikov
4752ae876a targets/libgl-xlib: remove unneeded GLX_SHARED_GLAPI define
There are no in-tree users of it.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 18:11:56 +01:00
Emil Velikov
f885f1ae14 drivers/x11: remove unneeded GLX_SHARED_GLAPI define
There are no in-tree users of it.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 18:11:53 +01:00
Emil Velikov
6177d60a37 glx: glX_proto_send.py: use correct compile guard GLX_INDIRECT_RENDERING
The code itself has nothing to do with shared glapi, thus having it
behind GLX_SHARED_GLAPI is misleading. Use GLX_INDIRECT_RENDERING
instead.

The latter macro is set at global scope by the Autotools and Scons build
systems.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 18:11:50 +01:00
Emil Velikov
123c1f69c0 mapi/es*api: remove unneeded HAVE_SHARED_GLAPI guard
Always true, since GLES* requires shared glapi.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 18:11:47 +01:00
Emil Velikov
3186d33ae9 mesa/dri: remove unneeded HAVE_SHARED_GLAPI guard
Always true, since the dri modules require shared glapi.

With earlier commit (da410e6afa "configure: explicitly require shared
glapi for enable-dri") we even made that explicit during the configure
stage.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 18:11:44 +01:00
Emil Velikov
4ea5f4b74c gallium/dri: remove unneeded HAVE_SHARED_GLAPI guard
Always true, since the dri modules require shared glapi.

With earlier commit (da410e6afa "configure: explicitly require shared
glapi for enable-dri") we even made that explicit during the configure
stage.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 18:11:40 +01:00
Emil Velikov
51accecce7 mesa/dri: always link against shared glapi
Analogous to previous commit. Check with the extensive commit
description and bug report referenced.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 18:11:37 +01:00
Emil Velikov
79a26b663a gallium/dri: always link against shared glapi
In the early days of Xorg and Mesa we had multiple providers of the
GLAPI. All of those were the ones responsible for dlopening the DRI
module. Hence it was perfectly fine, and actually expected, for the DRI
modules to have unresolved symbols.

Since then we've moved the API to a separate shared library and no other
libraries provide the symbols.

Here comes the picky part:
It's possible that one uses old Xorg (where libglx.so provides the
GLAPI) and new Mesa (with DRI modules linking against libglapi.so).

That should still work, since the libglx.so symbols will take
precedence over the libglapi.so ones.

I've verified this while running 1.14 series Xorg alongside this (and
next) patch.

It may seem a bit fragile, but that's reasonably OK since all of the
affected Xorg versions have been EOL for years.

The final one being the 1.14 series, which saw its final bug fix release
1.14.7 in June 2014.

To ensure that the binaries do not have unresolved symbols add
-no-undefined and $(LD_NO_UNDEFINED), just like we do everywhere else
throughout mesa.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98428
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 18:11:29 +01:00
Emil Velikov
9d2aa6e506 anv: fix anv_gem_mmap comment to not mention NULL
The function cannot return NULL, update the comment accordingly.

Fixes: b546c9d ("anv: anv_gem_mmap() returns MAP_FAILED as mapping error")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-05-04 18:06:18 +01:00
Emil Velikov
b6643095ba eg: explicitly size dri2_to_egl_attribute_map[]
This way we'll get an implicit zero initialization of the remaining
members, as required by dri2_add_config().

Fixes: e5efaeb85c ("egl: polish dri2_to_egl_attribute_map[]")
Cc: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-04 18:05:47 +01:00
Emil Velikov
2c60bf093e dri_interface.h: define __DRI_ATTRIB_MAX
Thus we can use the value to explicitly size arrays, instead of
__DRI_ATTRIB_FRAMEBUFFER_SRGB_CAPABLE + 1.

The latter seems magical and is error prone, as we add more dri
attributes.

v2: Fix off by one error (Tomasz)

Cc: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-04 18:05:25 +01:00
Ben Boeckel
58f51f0754 scons: update for LLVM 4.0
LLVMDemangle, LLVMGlobalISel, and LLVMDebugInfoMSF are new.

Also update the comment to add irreader to the list of components.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chuck Atkins <chuck.atkins@kitware.com>
Signed-off-by: Ben Boeckel <ben.boeckel@kitware.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-05-04 18:05:04 +01:00
Emil Velikov
c9c8e1c84d c11/threads: rework Windows thrd_current() comment
Drop the misleading "will not match the one returned by thread_create"
hunk and provide more clarity as to what/why GetCurrentThread() isn't
the solution we're looking for.

v2: Places brackets after function names (Eric)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-05-04 18:00:23 +01:00
Adam Jackson
f258815c7d egl/platform/drm: Don't take display ownership until gbm is initialized
If the gbm_create_device() call here actually did fail, any subsequent
eglTerminate on the display would segfault.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2017-05-04 12:52:18 -04:00
Adam Jackson
ddb99127a6 egl/x11: Honor the EGL_PLATFORM_X11_SCREEN_EXT attribute
Introduce _egl_display::Options::Platforms for private storage.
For X11 platforms we can use it for the screen number as set by
EGL_PLATFORM_X11_SCREEN_EXT.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2017-05-04 12:52:18 -04:00
Samuel Iglesias Gonsálvez
939b015736 anv: vkBindImageMemory() should return VK_ERROR_OUT_OF_{HOST,DEVICE}_MEMORY on failure
According to the spec we get VK_ERROR_OUT_OF_HOST_MEMORY or
VK_ERROR_OUT_OF_DEVICE_MEMORY on vkBindImageMemory failure.

Fixes returned value changed by b546c9d.

Fixes: b546c9d ("anv: anv_gem_mmap() returns MAP_FAILED as mapping error")
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0 17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-05-04 15:13:08 +02:00
Samuel Pitoiset
9db9b2e8cd glsl: reject memory qualifiers with uniform blocks
The spec allows memory qualifiers to be used with image variables,
buffer variables and shader storage blocks. This patch also fixes
validate_memory_qualifier_for_type().

Fixes the following ARB_uniform_buffer_object test:

uniform-block-memory-qualifier.frag

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 14:01:59 +02:00
Samuel Pitoiset
f8003d2516 glsl: reject format qualifiers with non-image types everywhere
Including structures, interfaces and uniform blocks.

Fixes the following ARB_shader_image_load_store test:

format-layout-with-non-image-type.frag

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 14:01:56 +02:00
Samuel Pitoiset
9efea874b9 glsl: rework validate_image_qualifier_for_type()
It makes more sense to have two separate validate functions,
mainly because memory qualifiers are allowed with members of
shader storage blocks.

validate_memory_qualifier_for_type() will be fixed in a
separate patch.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 14:01:47 +02:00
Samuel Pitoiset
a5f82db380 glsl: rename image_* qualifiers to memory_*
It doesn't make sense to prefix them with 'image' because
they are called "Memory Qualifiers" and they can be applied
to members of storage buffer blocks.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-05-04 09:51:25 +02:00
Samuel Iglesias Gonsálvez
b546c9d318 anv: anv_gem_mmap() returns MAP_FAILED as mapping error
Take it into account when checking if the mapping failed.

v2:
- Remove map == NULL and its related comment (Emil)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

Fixes: 6f3e3c715a ("vk/allocator: Add a BO pool")
Fixes: 9919a2d34d ("anv/image: Memset hiz surfaces to 0 when binding memory")
Cc: "17.0 17.1" <mesa-stable@lists.freedesktop.org>
2017-05-04 08:56:36 +02:00
Johnson Lin
a6fb943f3e nir/lower_tex: Fix minor error in YUV color conversion matrix
The matrix used for YCbCr to RGB is listed in:

    https://en.wikipedia.org/wiki/YCbCr

There was an error in converting the offsets from integers to unorm
values: 0.0625 = 16.0/256 should be 16.0/255, and 0.5 = 128.0/256 should
be 128.0/255.  With this fix, the CSC result is bit-aligned with Wikipedia's
conversion result and FFmpeg's result.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100854
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-05-03 23:44:59 -07:00
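
The offset error described in the commit above can be checked with a little arithmetic. The sketch below assumes an 8-bit channel is read as the unorm value byte/255; the function and constant names are hypothetical and do not come from nir_lower_tex.

```python
# Illustrative only: names are hypothetical, not Mesa's.

def unorm8(byte_value):
    """Map an 8-bit integer (0..255) to the unorm float a sampler returns."""
    return byte_value / 255.0

OLD_Y_OFFSET = 16.0 / 256.0   # 0.0625, the offset used before the fix
NEW_Y_OFFSET = 16.0 / 255.0   # the same offset on the unorm scale

def luma_residual(y_byte, offset):
    """Luma remaining after subtracting the black-level offset, before any gain."""
    return unorm8(y_byte) - offset
```

With the corrected offset, a video-range black sample (Y = 16) lands exactly on zero. With the old offset it keeps a small positive residual, which the conversion matrix then scales into the RGB result.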
Rafael Antognolli
da665d22f5 i965: Port gen4+ state emitting code to genxml.
On this patch, we port:
   - brw_polygon_stipple
   - brw_polygon_stipple_offset
   - brw_line_stipple
   - brw_drawing_rect

v2:
   - Also emit states for gen4-5 with this code.
v3:
   - Style fixes and remove excessive checks (Ken).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 20:40:20 -07:00
Rafael Antognolli
c85b217ab0 i965: Port gen6+ 3DSTATE_CC_STATE_POINTERS state to genxml.
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 20:40:09 -07:00
Rafael Antognolli
b47b845574 i965: Port gen6+ multisample state emitting code to genxml.
Emit 3DSTATE_MULTISAMPLE using brw_batch_emit.

v3:
   - Remove dead code (Ken)
   - Simplify #if/#endif (Ken)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 20:40:03 -07:00
Rafael Antognolli
158dcd8659 i965: Port gen4+ emit vertices code to genxml.
Some code that was placed in brw_draw_upload.c and exported to be used
by gen8+ was also moved to genX_state_upload, and the respective symbols
are not exported anymore.

v2:
   - Remove code from brw_draw_upload too
   - Emit vertices for gen4-5 too.
   - Use helper to setup brw_address (Kristian)
   - Use macros for MOCS values.
   - Do not use #ifndef NDEBUG on code that is actually used (Ken)
v3:
   - Style and code cleanup (Ken)
   - Keep some of the common code inside brw_draw_upload.c (Ken)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 20:39:48 -07:00
Rafael Antognolli
46d8f9454f i965: Port push constant code to genxml.
The following states are ported on this patch:
   - gen6_gs_push_constants
   - gen6_vs_push_constants
   - gen6_wm_push_constants
   - gen7_tes_push_constants

v2:
   - Use helper to setup brw_address (Kristian)
v3:
   - Do not use macro for upload_constant_state (Ken)
   - Do not re-declare MOCS macro (Ken)
v4: (by Ken)
   - Drop more dead code, change brw->gen checks to GEN_GEN, style nits

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 20:38:20 -07:00
Rafael Antognolli
d729936c5e i965: Port gen6+ 3DSTATE_SCISSOR_STATE_POINTERS to use genxml.
Emit 3DSTATE_SCISSOR_STATE_POINTERS using brw_batch_emit, and pack the
scissor states using GENX(SCISSOR_RECT_pack), generated from genxml.

v3:
   - Remove old code (Ken)
   - Style fixes (Ken)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 18:57:52 -07:00
Rafael Antognolli
c5ddd4782c i965: Port gen7+ 3DSTATE_TE to genxml.
Emit 3DSTATE_TE on Gen7+ using brw_batch_emit helper.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 18:57:52 -07:00
Rafael Antognolli
98cce55317 i965: Port gen6+ blend state code to genxml.
Upload blend states using GENX(BLEND_STATE_ENTRY_pack), generated from
genxml.

v3:
   - style fixes (Ken)
   - cleanup to remove excessive #ifdef's (Ken)
   - remove memset (Ken)
   - disable blend.AlphaToCoverageDitherEnable on gen6 (Ken)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 18:57:52 -07:00
Rafael Antognolli
bc1ff4509d i965: Port gen6+ state emitting code to genxml.
Ported in this patch:
   - 3DSTATE_DS
   - 3DSTATE_GS
   - 3DSTATE_HS
   - 3DSTATE_VIEWPORT_STATE_POINTERS_SF_CL

v3:
   - Remove NEW_TRANSFORM blocks (Ken)
   - Bring back some comments and workaround for Ivybridge (Ken)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 18:57:52 -07:00
Rafael Antognolli
689a46f30e i965: Port gen6+ 3DSTATE_VS to genxml.
Emit 3DSTATE_VS on Gen6+ using brw_batch_emit helper, that uses pack
structs from genxml.

v2:
   - Use render_bo helper to setup brw_address (Kristian)
v3:
   - Bring back some comments for gen6 and remove _NEW_TRANSFORM blocks
   from gen7+.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 18:57:51 -07:00
Rafael Antognolli
11ee4ac5e5 i965: Port gen8+ 3DSTATE_PS_EXTRA to genxml.
Emit 3DSTATE_PS_EXTRA on Gen8+ using brw_batch_emit helper, that uses
pack structs from genxml.

v3:
   - Style fixes and moving code around to be cleaner (Ken)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 18:57:51 -07:00
Rafael Antognolli
46934d9594 i965: Port gen6+ 3DSTATE_WM to genxml.
Emit 3DSTATE_WM on Gen6+ using brw_batch_emit helper, that uses pack
structs from genxml.

v2:
   - Use render_bo helper to setup brw_address (Kristian)
   - Remove TODO and use BRW_PSCDEPTH_OFF.
v3:
   - A couple of style fixes (Ken)
   - Enable RASTRULE_UPPER_RIGHT on gen6+ instead of gen8+ (Ken)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 18:57:51 -07:00
Rafael Antognolli
23f69dfc0f i965: Port gen7+ 3DSTATE_PS to genxml.
Emit 3DSTATE_PS on Gen7+ using brw_batch_emit helper, that uses pack
structs from genxml.

v2:
   - Use render_bo helper to setup brw_address (Kristian)
v3:
   - Style fixes and code cleanup (Ken)
v4:
   - More style fixes and code cleanup missed in v3

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 18:57:51 -07:00
Rafael Antognolli
ddc6f4d069 i965: Port gen7+ 3DSTATE_SOL to genxml.
Emit 3DSTATE_SOL on Gen7+ using brw_batch_emit helper, that uses pack
structs from genxml.

v2:
   - Add helpers to assign struct brw_address (Kristian)
v3:
   - Rename MOCS -> SOBufferMOCS
   - Do not re-declare MOCS macros (Ken).
   - Style and code reorganization (Ken).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 18:57:51 -07:00
Rafael Antognolli
c5d6ee6ccb i965: Remove calculate_attr_overrides.
This function now lives inside genX_state_upload.c.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 18:57:51 -07:00
Rafael Antognolli
072bcb8edc i965: Port Gen7+ 3DSTATE_SBE state to genxml.
Emit 3DSTATE_SBE on Gen7+ using brw_batch_emit helper, that uses pack
structs from genxml.

v2: - Use ACTIVE_COMPONENT_XYZW from gen9.xml.
v3: - Style fixes (Ken)
v4: #undef unconditionally (Ken)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 18:57:51 -07:00
Rafael Antognolli
9f12d9166b i965: Port gen6+ 3DSTATE_SF to genxml.
Emit sf state on Gen6+ using brw_batch_emit helper, using pack structs
from genxml.

v3:
   - Reorganize code and reduce #if/#endif's (Ken)
   - Style fixes (Ken)
   - Always set AALINEDISTANCE_TRUE (Ken)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 18:57:51 -07:00
Rafael Antognolli
e00d159f4d i965: Add brw_get_line_width_float.
That helper function returns the line width as a float, and is then used
by brw_get_line_width to return the fixed point width.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 18:57:51 -07:00
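
The split described in the commit above, a float helper feeding a fixed-point conversion, can be sketched as follows. This is not the driver's code: the function name is hypothetical, and the default of 7 fractional bits is an assumption made for illustration.

```python
# Hypothetical sketch: convert a float line width to unsigned fixed point.
# frac_bits=7 is an assumed format, not taken from the commit message.

def line_width_to_fixed(width, frac_bits=7):
    """Scale by 2**frac_bits and round to the nearest integer."""
    return int(round(width * (1 << frac_bits)))
```

Keeping the float version separate lets callers that want the real-valued width reuse the same clamping logic without undoing the fixed-point scaling.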
Rafael Antognolli
13ac46557a i965: Port Gen8+ 3DSTATE_RASTER state to genxml.
Emits 3DSTATE_RASTER from genX_state_upload.c using pack structs from
genxml.

v3:
   - Style fixes (Ken)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 18:57:51 -07:00
Rafael Antognolli
36c02ce448 i965: Port Gen6+ 3DSTATE_CLIP state to genxml.
Emit clip state on Gen6+ using brw_batch_emit helper, using pack structs
from genxml.

v3:
   - Lots of style fixes (Ken)
   - Do not set CullTestEnableBitMask on Gen8+ (Ken)
v4:
   - Do not include brw_defines_common.h.
v5 (Ken): s/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 18:57:51 -07:00
Kenneth Graunke
dae5cc79c6 i965: Port Gen6+ DEPTH_STENCIL state to genxml.
This emits 3DSTATE_WM_DEPTH_STENCIL on Gen8+ or DEPTH_STENCIL_STATE
(and the relevant pointer packets) on Gen6-7.5 from a single function.

v3:
   - Watch for BRW_NEW_BATCH too on gen < 8 (Ken)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 18:57:51 -07:00
Kenneth Graunke
5a19d0bcec i965: Get real per-gen atom lists
Make atoms initialization compile conditionally based on the target
platform.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-05-03 18:57:51 -07:00
Kenneth Graunke
9afb98c429 i965: Add genxml related plumbing in a new genX_state_upload.c file.
v3 (Rafael): Drop aub parameter
v4 (Ken): Squash in gen4/g45 automake fixes

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-05-03 18:57:51 -07:00
Kenneth Graunke
3ae99de2e8 i965: Drop "Destination Element Offset" from Ironlake SGVs.
The Ironlake documentation is terrible, so it's unclear whether or not
this field exists there.  It definitely doesn't exist on Sandybridge
and later.  It definitely does exist on G45.

We haven't been setting it for our normal vertex attributes - just
the SGVs (VertexID, InstanceID, BaseVertex, BaseInstance, DrawID).
We should be consistent.  My guess is that it isn't necessary and
doesn't exist - this patch drops it from the SGVs elements, making
them follow the behavior of most attributes.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-05-03 18:57:51 -07:00
Rafael Antognolli
f321f695d3 genxml: Fix 3DSTATE_DEPTH_BUFFER length on gen5.
The hardware docs are wrong, but the length used in the xml is also
wrong.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 18:57:51 -07:00
Dave Airlie
7088b655e8 radeonsi: constify a bunch of the perfcounter structs.
This moves the structs from the data segment to the rodata segment,
which seems like the more correct place for them.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-04 11:52:47 +10:00
Timothy Arceri
ad282c0b9e st/glsl_to_tgsi: remove unrequired tgsi_get_opcode_info() call
This is already set for the instruction at initialisation.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-04 11:42:34 +10:00
Timothy Arceri
e2f3007665 mesa: make _mesa_accum() static
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-04 11:35:37 +10:00
Timothy Arceri
b549f054a6 mesa: tidy up accum.h
These were unused.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-05-04 11:35:37 +10:00
Timothy Arceri
e473fdcdab mesa/varray: make use of dispatch KHR_no_error support
Make use of dispatch KHR_no_error support for varray functions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 11:35:37 +10:00
Timothy Arceri
2f541f63ea glapi: add KHR_no_error support to dispatch table generation
This will allow us to create no error versions of functions
noted by a _no_error suffix. We also need to set a no_error
attribute equal to "true" in the xml.

V3: stop the no_error attribute being overwritten when functions
    alias another.
V2: tidy up suggested by Nicolai.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-04 11:35:36 +10:00
Bas Nieuwenhuizen
33ad6226a0 radv: Don't use FLAT_SHADE for constants.
Setting both offset to 0x20 and flat shade results in passthrough
mode instead of the constant.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: f205e19e4f "radv/ac: eliminate unused vertex shader outputs. (v2)"
2017-05-04 10:38:14 +10:00
Rafael Antognolli
91ab1ccbfe i965: Move MOCS macros to brw_context.h.
These macros are defined in brw_defines.h, which contains a lot of
macros that conflict with autogenerated code from genxml. But we need to
use them (the MOCS macros) in some of that same genxml code.

Moving them to brw_context.h solves that problem and we don't have to
include brw_defines.h.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 16:59:22 -07:00
Rafael Antognolli
2e5d65ccb6 anv: Use BRW_BARYCENTRIC_NONPERSPECTIVE_BITS from common header.
In a previous patch some enums were split out from brw_eu_defines.h, so
they could be used by genxml based code. anv can also benefit from this.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 16:58:55 -07:00
Rafael Antognolli
8fa8abef4b i965: Move enums to brw_compiler.h.
These enums live inside struct brw_wm_prog_data, so it makes sense to
keep them in the same header. It also allows using them without
including brw_eu_defines.h.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 16:55:58 -07:00
Rafael Antognolli
a66743ce8d genxml: Update 3DSTATE_LINE_STIPPLE xml on gen6.
From the PRM, Line Stipple Inverse Repeat Count is on dw2, bits 31:16,
format U1.13.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 16:41:14 -07:00
Rafael Antognolli
7d5cc5b954 genxml: Normalize xml for 3DSTATE_CC_STATE_POINTERS.
- "COLOR_CALC_STATE Change" -> "Color Calc State Pointer Valid"
   - "Pointer to COLOR_CALC_STATE" -> "Color Calc State Pointer"
   - "BackFace" -> "Backface"

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 16:41:07 -07:00
Rafael Antognolli
b89805a7bc genxml: Normalize xml for 3DSTATE_MULTISAMPLE.
Name the options to "Pixel Location":
   - PIXLOC_CENTER -> CENTER
   - PIXLOC_UL_CORNER -> UL_CORNER

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 16:41:07 -07:00
Rafael Antognolli
c032cae9ff genxml: Rename "Function Enable" to "Enable".
Rename that field name on genxml for:
   - 3DSTATE_GS - gen6+
   - 3DSTATE_DS - gen7+
   - 3DSTATE_HS - gen7+

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 16:41:07 -07:00
Rafael Antognolli
5b4223dc8e genxml: Clip guardbands are float, not int.
This makes genxml create the right struct types, and generate the right
batch commands.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 16:41:07 -07:00
Rafael Antognolli
4266c372d9 genxml: 3DSTATE_VS rename Function Enable to Enable.
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 16:41:07 -07:00
Kenneth Graunke
da299b7df3 genxml: Make "Reorder Mode" fields consistent.
Both GS and SOL have these fields.  Some were ReorderEnable = true,
some were ReorderMode = REORDER_TRAILING, and some were just TRAILING.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-05-03 16:41:07 -07:00
Rafael Antognolli
872ffb2221 genxml: Add alias for MOCS.
Use an alias, so we can set the same value as the #define's.

v3:
   - Call it "SO Buffer MOCS" to follow the most common naming scheme.
   - Add alias for gen7 and gen75 too (Ken).

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 16:41:02 -07:00
Rafael Antognolli
b5e652fc83 genxml: Add missing field values to 3DSTATE_SBE.
Fill out "Attribute Active Component Format" possible values.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 16:41:02 -07:00
Rafael Antognolli
273a10b3f1 genxml: Update xml for 3DSTATE_SF.
- Normalize "Anti-Aliasing Enable"
 - Add "Multisample Rasterization Mode" constants
 - Rename "Use Point Width on Vertex" to "Vertex"
 - Rename "Use Point Width from State" to "State"

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 16:41:02 -07:00
Rafael Antognolli
3f155ab290 genxml: Rename clip enable property.
There are two variants:
   - Clip Enable
   - CLIP Enable (on gen6)

Rename everything to Clip Enable.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 16:41:02 -07:00
Louis-Francis Ratté-Boulianne
e0aa2bd9cb genxml: Fill out Gen4, Gen45 and Gen5 XML
Add some more details to Gen4 and Gen45 and add what is needed
in Gen5 XML. This commit overwrites the previous work done on Gen4
and Gen45 as it contains more instructions and fixes some mistakes.
However, comments (dword boundaries) are lost in the process.

v3:
   - Set the type of some fields, instead of prefix. Also fix the
     SAMPLER_BORDER_COLOR_STATE fields of gen5.xml.

Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-05-03 16:40:52 -07:00
Jason Ekstrand
4201cc2dd3 anv: Implement VK_KHX_external_semaphore_fd
This implementation allocates a 4k BO for each semaphore that can be
exported using OPAQUE_FD and uses the kernel's already-existing
synchronization mechanism on BOs.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-03 15:09:46 -07:00
Jason Ekstrand
ef2e427d78 anv: Pull the guts of cmd_buffer_execbuf into a helper
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-03 15:09:46 -07:00
Jason Ekstrand
975c0f339f anv: Implement VK_KHX_external_semaphore
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-03 15:09:46 -07:00
Jason Ekstrand
298e054d0c anv: Implement VK_KHX_external_semaphore_capabilities
This just stubs things out.  Real external semaphore support will come
with VK_KHX_external_semaphore_fd.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-03 15:09:46 -07:00
Jason Ekstrand
65aa89e75f anv: Add a real semaphore struct
It's just a dummy for now, but we'll flesh it out as needed for external
semaphores.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-05-03 15:09:46 -07:00
Marek Olšák
f466683cb0 radeonsi/gfx9: fix gl_ViewportIndex
v2: remove unnecessary LLVMBuildAnd calls

Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-03 22:58:27 +02:00
Marek Olšák
ec34632859 radeonsi/gfx9: set VGT_REUSE_OFF = 0
same as Vulkan

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-03 22:58:27 +02:00
Christian Gmeiner
a8007ed687 etnaviv: add L8A8_UNORM texture format
No piglit regressions.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
2017-05-03 22:43:10 +02:00
Andres Gomez
e4ae4d2789 glsl: Corrected some typos and error messages
v2: left code style/formatting corrections out.

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-05-03 23:18:00 +03:00
Grazvydas Ignotas
8aab792e92 radv: don't leak DRM devices
After a successful drmGetDevices2() call, drmFreeDevices() needs to be called.

Fixes: 743315f2 "radv: do not open random render node(s)"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-05-03 22:04:52 +03:00
Grazvydas Ignotas
898cbb491b radv: fix possible stack corruption
drmGetDevices2 takes count and not size. Probably hasn't caused problems
yet in practice and was missed as setups with more than 8 DRM devices
are not very common.

Fixes: 743315f2 "radv: do not open random render node(s)"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-05-03 22:02:45 +03:00
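Note on the count-versus-size mix-up fixed in the commit above: a minimal sketch, not the radv code. `fill_slots` and `probe` are hypothetical stand-ins for an API that, like drmGetDevices2(), takes the maximum number of array elements rather than a size in bytes.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for an API that takes the maximum number of
 * elements it may write, not a size in bytes. */
static int fill_slots(void *slots[], int max_slots, int available)
{
    int n = available < max_slots ? available : max_slots;

    for (int i = 0; i < n; i++)
        slots[i] = NULL;
    return n;
}

static int probe(int available)
{
    void *slots[8];

    /* Passing sizeof(slots) here would hand over a byte size (64 on a
     * typical 64-bit system) and let the callee write past the array
     * once more than 8 entries are available. */
    return fill_slots(slots, (int)ARRAY_SIZE(slots), available);
}
```

With the element count passed, the callee never writes more than the 8 entries the array can hold, however many are available.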
Marek Olšák
b08715499e ac: eliminate duplicated VS exports
Only very few shaders have them (from 48486 shaders):

shaders/private/left_4_dead_2/765.shader_test - ac: 1 matches 2
shaders/private/left_4_dead_2/877.shader_test - ac: 1 matches 6
shaders/private/left_4_dead_2/2141.shader_test - ac: 1 matches 6
shaders/private/ue4_effects_cave/11.shader_test - ac: 4 matches 5
shaders/private/ue4_effects_cave/14.shader_test - ac: 5 matches 6
shaders/private/ue4_effects_cave/46.shader_test - ac: 5 matches 6
shaders/private/ue4_effects_cave/42.shader_test - ac: 4 matches 5
shaders/private/ue4_effects_cave/104.shader_test - ac: 4 matches 5
shaders/private/f1-2015/336.shader_test - ac: 3 matches 4
shaders/private/f1-2015/948.shader_test - ac: 6 matches 7
shaders/private/f1-2015/602.shader_test - ac: 0 matches 3
shaders/private/f1-2015/600.shader_test - ac: 0 matches 3
shaders/private/f1-2015/1214.shader_test - ac: 0 matches 1
shaders/private/f1-2015/988.shader_test - ac: 4 matches 5
shaders/private/ue4_elemental/149.shader_test - ac: 3 matches 4
shaders/private/ue4_elemental/346.shader_test - ac: 4 matches 5
shaders/private/ue4_elemental/178.shader_test - ac: 3 matches 4
shaders/private/ue4_elemental/136.shader_test - ac: 4 matches 5
shaders/private/ue4_elemental/168.shader_test - ac: 4 matches 5
shaders/private/ue4_elemental/690.shader_test - ac: 3 matches 4
shaders/private/ue4_elemental/19.shader_test - ac: 5 matches 6
shaders/private/dota2/1901.shader_test - ac: 0 matches 5
shaders/private/dota2/1357.shader_test - ac: 0 matches 5
shaders/private/dota2/1375.shader_test - ac: 0 matches 5
shaders/private/dota2/1369.shader_test - ac: 0 matches 5
shaders/private/dota2/1583.shader_test - ac: 0 matches 5
shaders/private/dota2/1811.shader_test - ac: 0 matches 5
shaders/private/dota2/1893.shader_test - ac: 0 matches 5
shaders/private/dota2/1533.shader_test - ac: 0 matches 5
shaders/private/dota2/1951.shader_test - ac: 0 matches 5
shaders/private/dota2/1361.shader_test - ac: 0 matches 5
shaders/private/mad_max/2792.shader_test - ac: 0 matches 1
shaders/private/mad_max/2794.shader_test - ac: 0 matches 1
shaders/private/mad_max/2780.shader_test - ac: 0 matches 1
shaders/private/mad_max/2902.shader_test - ac: 0 matches 1
shaders/private/bioshock-infinite/3050.shader_test - ac: 3 matches 7
shaders/private/bioshock-infinite/2544.shader_test - ac: 3 matches 6
shaders/private/bioshock-infinite/3062.shader_test - ac: 3 matches 8
shaders/private/bioshock-infinite/2012.shader_test - ac: 3 matches 7
shaders/private/bioshock-infinite/3058.shader_test - ac: 3 matches 7
shaders/private/bioshock-infinite/3270.shader_test - ac: 3 matches 7
shaders/private/bioshock-infinite/732.shader_test - ac: 3 matches 7
shaders/private/bioshock-infinite/3026.shader_test - ac: 3 matches 7
shaders/private/bioshock-infinite/3258.shader_test - ac: 3 matches 6
shaders/private/bioshock-infinite/3198.shader_test - ac: 3 matches 6
shaders/private/bioshock-infinite/3046.shader_test - ac: 3 matches 7
shaders/private/bioshock-infinite/3168.shader_test - ac: 3 matches 6
shaders/private/bioshock-infinite/2550.shader_test - ac: 3 matches 6
shaders/private/bioshock-infinite/3210.shader_test - ac: 3 matches 6
shaders/private/bioshock-infinite/3032.shader_test - ac: 3 matches 6
shaders/private/bioshock-infinite/668.shader_test - ac: 3 matches 7

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-03 20:55:00 +02:00
Marek Olšák
7647e90b15 ac: rename ac_eliminate_const_vs_outputs -> ac_optimize_vs_outputs
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-03 20:55:00 +02:00
Marek Olšák
faa37475e9 ac: first parse VS exports before eliminating constant ones
A later commit will make use of this.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-03 20:55:00 +02:00
Jason Ekstrand
f8d7c23e1f anv: Trivially implement multiDrawIndirect
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-03 11:25:46 -07:00
Jason Ekstrand
272b7e7d25 anv: Enable VK_KHX_multiview and SPV_KHR_multiview
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-03 11:25:46 -07:00
Jason Ekstrand
3dbd7737d4 anv/cmd_buffer: Emit instanced draws for multiple views
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-03 11:25:46 -07:00
Jason Ekstrand
32abb0e13c anv/cmd_buffer: Pull indirect draw parameter loading into a helper
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-03 11:25:46 -07:00
Jason Ekstrand
0db7070330 anv/pipeline: Add shader lowering for multiview
v2 (Jason Ekstrand):
 - Take a view_mask rather than a whole subpass
 - Build the view mask into the VS shader key

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-03 11:25:46 -07:00
Jason Ekstrand
ca5bdfdfc6 anv/pipeline: Add a subpass field to anv_pipeline
This simplifies the code in a variety of places.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-03 11:25:46 -07:00
Jason Ekstrand
c4549e05aa anv/pipeline: Call nir_gather_info later
We want to insert more lowering code that may insert system values and
we need to gather info after that lowering.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-03 11:25:46 -07:00
Jason Ekstrand
dcb6a68bb4 anv: Move shader hashing to anv_pipeline
Shader hashing is very closely related to shader compilation.  Putting
them right next to each other in anv_pipeline makes it easier to verify
that we're actually hashing everything we need to be hashing.  The only
real change (other than the order of hashing) is that we now hash in the
shader stage.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-03 11:25:46 -07:00
Jason Ekstrand
d6b8106eea anv/pass: Store the per-subpass view mask
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-03 11:25:46 -07:00
Jason Ekstrand
e997f548de anv: Add the KHX_multiview boilerplate
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-03 11:25:46 -07:00
Jason Ekstrand
0bed97006f anv/nir: Delete the apply_dynamic_offsets prototype
That pass hasn't existed since dd4db84640
but the prototype stuck around for no reason.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-05-03 11:25:46 -07:00
Jason Ekstrand
f903f78b72 spirv: Add support for SPV_KHR_multiview
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-05-03 11:25:46 -07:00
Jason Ekstrand
99d0709553 spirv: Bump the SPIR-V header to the latest public version
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-05-03 11:25:46 -07:00
Jason Ekstrand
bb41d9a1d3 compiler: Add a system value and varying for ViewIndex
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-05-03 11:25:46 -07:00
Bartosz Tomczyk
fcf941068e mesa/vbo: reduce prim array size
We only ever use a single element.

v2: Change single element arrays to variables

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-03 18:22:58 +02:00
Brian Paul
a30313abf6 mesa: add const qualifier on _mesa_valid_to_render()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-05-03 08:48:46 -06:00
Samuel Iglesias Gonsálvez
f57e234fdd i965/vec4: don't modify regioning parameters to the sources of DF align1 instructions
The regioning parameters are now properly set by convert_to_hw_regs()
and we don't need to fix them in the generator. That latter fix
previously done in the generator was strictly speaking wrong for any
non-identity regions.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-05-03 15:32:39 +02:00
Samuel Iglesias Gonsálvez
aaeb1c99be i965/vec4: fix register width for DF VGRF and UNIFORM
On gen7, the swizzles used in DF align16 instructions work for an element
size of 32 bits, so we can address only 2 consecutive DFs. As we assumed that
in the rest of the code and prepare the instructions for this (scalarize_df()),
we need to set it to two again.

However, for DF align1 instructions, a width of 2 is wrong as we are not
reading the data we want. For example, a uniform would have a region of
<0, 2, 1> so it would repeat the first 2 DFs, when we wanted to access
the first 4.

This patch sets the default one to 4 and then modifies the width of
align16 instruction's DF sources when we translate the logical swizzle
to the physical one.

v2:
- Remove conditional (Curro).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-05-03 15:32:39 +02:00
Samuel Iglesias Gonsálvez
7f728bce81 i965/vec4: fix vertical stride to avoid breaking region parameter rule
From IVB PRM, vol4, part3, "General Restrictions on Regioning
Parameters":

  "If ExecSize = Width and HorzStride ≠ 0, VertStride must
   be set to Width * HorzStride."

In the next patch, we are going to modify the region parameter for
uniforms and vgrf. For uniforms that are the source of
DF align1 instructions, they will have <0, 4, 1> regioning and
the execsize for those instructions will be 4, so they will break
the regioning rule. This will be the same for VGRF sources where
we use the vstride == 0 exploit.

As we know we are not going to cross the GRF boundary with that
execsize and parameters (not even with the exploit), we just fix
the vstride here.

v2:
- Move is_align1_df() (Curro)
- Refactor exec_size == width calculation (Curro)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-05-03 15:32:39 +02:00
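Note on the regioning rule quoted in the commit above: a minimal sketch, assuming plain element counts rather than the hardware's encoded field values. `struct region` and `fix_vstride` are hypothetical names, not the i965 implementation.

```c
/* Hypothetical region description: <vstride, width, hstride>. */
struct region {
    unsigned vstride;
    unsigned width;
    unsigned hstride;
};

/* Apply the quoted rule: if ExecSize == Width and HorzStride != 0,
 * VertStride must be Width * HorzStride. Other regions are left alone. */
static struct region fix_vstride(struct region r, unsigned exec_size)
{
    if (exec_size == r.width && r.hstride != 0)
        r.vstride = r.width * r.hstride;
    return r;
}
```

For the <0, 4, 1> region with an execsize of 4 mentioned in the message, this turns the vertical stride into 4; a <0, 2, 1> region with the same execsize is unchanged.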
Dave Airlie
3bf3f9866c radv/ac: canonicalize the output for 32-bit float min/max.
This fixes:
dEQP-VK.glsl.builtin.precision.min.*
dEQP-VK.glsl.builtin.precision.max.*
dEQP-VK.glsl.builtin.precision.clamp.*

The problem is the hw doesn't compare denorms properly,
so we have to flush them. The spec says flushing is
optional, but if you don't flush, the results should
still be correct.

The -pro driver changes the shader float mode,
it would be nice if llvm could grow that perhaps.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-03 12:55:34 +10:00
Dave Airlie
83e58b036e radv: flush f32->f16 conversion denormals to zero. (v2)
SPIR-V defines the f32->f16 operation as flushing denormals to 0;
this compares the class using the amd class opcode.

Thanks to Matt Arsenault for figuring it out.

This fix is VI+ only, add a TODO for SI/CIK.

This fixes:
dEQP-VK.spirv_assembly.instruction.compute.opquantize.flush_to_zero

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-03 12:55:34 +10:00
Bas Nieuwenhuizen
eeff7e1154 radv: Add userspace fence buffer per context.
Having it in the winsys didn't work when multiple devices use
the same winsys, as we then have multiple contexts per queue,
and each context counts separately.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: 7b9963a28f "radv: Enable userspace fence checking."
2017-05-03 03:10:12 +02:00
Dave Airlie
2a2a21450b radv: enable lower_sub to fix loop unrolling.
Loop unroll asserts if it hits a sub, we don't really want
to lower subs as llvm handles these things, but do this for
now, until we can fix loop unroll to work with subs.

Fixes: 14ae0bfa5 (radv: Add NIR loop unrolling)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-03 09:03:43 +10:00
Bas Nieuwenhuizen
9e847eedd5 radv: Don't set dynamic state for pipelines with rasterizer discard.
All of the dynamic states apply to rasterization & fragment processing,
so we don't need to set them if we don't rasterize.

We don't clear the dirty flags for them though, so we don't miss any
updates for the next pipeline with rasterization.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: 76603aa90b "radv: Drop the default viewport when 0 viewports are given."
2017-05-03 00:12:56 +02:00
Dave Airlie
a524704025 radv: flush more stages when semaphores are waiting.
This still doesn't give us complete pWaitDstStageMask support,
but it should provide enough to be correct if not as efficient as
possible.

If we have wait semaphores we must flush between submits and
flush the shaders as well.

This fixes the remaining fails in:
dEQP-VK.synchronization.op.single_queue.semaphore.*ssbo*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-03 07:21:31 +10:00
Samuel Pitoiset
e0e01895b0 glsl: set vector_elements to 1 for samplers
I don't see any reasons why vector_elements is 1 for images and
0 for samplers. This increases consistency and allows cleaning
up some code a bit.

This will also help for ARB_bindless_texture.

No piglit regressions with RadeonSI.

This time the Intel CI system doesn't report any failures.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-05-02 22:40:45 +02:00
Eric Anholt
ece06defe7 vc4: Use runtime CPU detection for whether NEON is available.
This will allow Raspbian's ARMv6 builds to take advantage of the new NEON
code, and could prevent problems if vc4 ends up getting used on a v7 CPU
without NEON.

v2: Drop dead NEON_SUFFIX (noted by Erik Faye-Lund)
2017-05-02 13:35:23 -07:00
Eric Anholt
a373f77662 vc4: Use a wrapper file to set VC4_BUILD_NEON instead of CFLAGS.
Android.mk was setting the flag across the entire driver, so we didn't
have non-NEON versions getting built.  This was going to be a problem with
the next commit, when I start auto-detecting NEON support and use the
non-NEON version when appropriate.

Reviewed-by: Rob Herring <robh@kernel.org>
2017-05-02 13:35:23 -07:00
Eric Anholt
463b7d0332 gallium: Enable ARM NEON CPU detection.
I wrote this code with reference to pixman, though I've only decided to
cover Linux (what I'm testing) and Android (seems obvious enough).  Linux
has getauxval() as a cleaner interface to the /proc entry, but it's more
glibc-specific and I didn't want to add detection for that.

This will be used to enable NEON at runtime on ARMv6 builds of vc4.

v2: Actually initialize the temp vars in the Android path (noticed by
    daniels)
v3: Actually pull in the cpufeatures library (change by robher).
    Use O_CLOEXEC.  Break out of the loop when we find our feature.
v4: Drop VFP code, which was confused about what it was detecting and not
    actually used yet.

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-05-02 13:35:23 -07:00
Dave Airlie
3c73063974 radv: fix stencil only clears.
If we are clearing stencil only, we still need to provide
a valid Z output from the vertex shader, we can't rely
on the depth clear value having any meaning, as we use this
for the position output, and it could get clipped, so we
don't end up clearing anything.

Fixes:
dEQP-VK.renderpass.simple.stencil
since I added S8 support.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-03 06:31:20 +10:00
Philipp Zabel
b539335e50 renderonly: use drmIoctl
To restart interrupted system calls, use drmIoctl.

Fixes: 848b49b288 ("gallium: add renderonly library")
CC: <mesa-stable@lists.freedesktop.org>
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-05-02 22:22:53 +02:00
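Note on restarting interrupted system calls, as in the commit above: a minimal sketch of the retry loop that a wrapper such as drmIoctl performs around ioctl(). `ioctl_retry` is a hypothetical name used for illustration and is not the libdrm function itself.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>

/* Hypothetical wrapper: repeat the ioctl while it fails only because it
 * was interrupted (EINTR) or asked to be retried (EAGAIN). Any other
 * result, success or failure, is returned to the caller unchanged. */
static int ioctl_retry(int fd, unsigned long request, void *arg)
{
    int ret;

    do {
        ret = ioctl(fd, request, arg);
    } while (ret == -1 && (errno == EINTR || errno == EAGAIN));

    return ret;
}
```

A caller using a bare ioctl() would instead see a spurious failure whenever a signal arrives during the call.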
Philipp Zabel
cd8ee259c8 renderonly: drop resources on destroy
The renderonly_scanout holds a reference on its prime pipe resource,
which should be released when it is destroyed. If it was created by
renderonly_create_kms_dumb_buffer_for_resource, the dumb BO also has
to be destroyed.

Fixes: 848b49b288 ("gallium: add renderonly library")
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-05-02 22:19:23 +02:00
Philipp Zabel
ab51cd2f26 renderonly: close transfer prime_fd
prime_fd is only used to transfer the scanout buffer to the GPU inside
renderonly_create_kms_dumb_buffer_for_resource. It should be closed
immediately to avoid leaking the DMA-BUF file handle.

Fixes: 848b49b288 ("gallium: add renderonly library")
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-05-02 22:19:19 +02:00
Dave Airlie
09034aab64 radv/wsi: report presentation error per image request
This ports
0fcb92c17d
anv: wsi: report presentation error per image request

This fixes:
dEQP-VK.wsi.xlib.incremental_present.scale_none.*

Reviewed-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-03 06:11:19 +10:00
Dave Airlie
ce0f692528 radv: minor pahole related improvements.
This just reduces the structs by 4-8 bytes each.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-03 06:03:07 +10:00
Dave Airlie
9399870ef0 radv/image: resize some surface members.
Oops meant to be part of previous series.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-03 06:03:02 +10:00
Dave Airlie
fe6d9c0825 radv: drop unused surface level members.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-03 06:00:42 +10:00
Dave Airlie
5d0f792f06 radv/image: drop blk_d
This was pretty much unused.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-03 06:00:38 +10:00
Dave Airlie
052487be4c radv: remove some members of radeon surface.
We would be storing this info twice per image, no need to,
remove it from the surface struct.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-03 06:00:35 +10:00
Dave Airlie
7e8d0a402b radv: move some image info into a separate struct.
This is to rework the surface code like radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-03 06:00:17 +10:00
Dave Airlie
d5400a5ec2 radv: provide a helper for comparing an image extents.
This just makes it easier to do the follow-on cleanups of the surface.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-03 05:59:52 +10:00
Daniel Stone
80ac89a952 gbm/dri: Fix sign-extension in modifier query
When we were assembling the unsigned 64-bit query return from its
two signed 32-bit component parts, the lower half was getting
sign-extended into the top half. Be more explicit about what we want to
do.

Fixes gbm_bo_get_modifier() returning ((1 << 64) - 1) rather than
((1 << 56) - 1), i.e. DRM_FORMAT_MOD_INVALID.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2017-05-02 19:55:13 +01:00
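Note on the sign-extension problem described in the commit above: a minimal sketch, not the actual gbm code. `combine_buggy` and `combine_fixed` are hypothetical helpers, and their two signed 32-bit arguments stand in for the two halves of the modifier query.

```c
#include <stdint.h>

/* Hypothetical buggy version: ORing a negative int32_t into a uint64_t
 * converts it to 64 bits first, which sets all of the upper 32 bits. */
static uint64_t combine_buggy(int32_t upper, int32_t lower)
{
    uint64_t ret = (uint64_t)upper << 32;

    ret |= lower;
    return ret;
}

/* Hypothetical fixed version: go through uint32_t so each half is
 * zero-extended before the two are combined. */
static uint64_t combine_fixed(int32_t upper, int32_t lower)
{
    return ((uint64_t)(uint32_t)upper << 32) | (uint32_t)lower;
}
```

With an upper half of 0x00ffffff and a lower half of -1 (all bits set), the buggy version yields ((1 << 64) - 1), while the fixed version yields ((1 << 56) - 1), matching the values quoted in the message.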
Eric Anholt
fba6559a1e nir: Pick just the channels we want for bitmap and drawpixels lowering.
NIR now validates that SSA references use the same number of channels as
are in the SSA value.

v2: Reword commit message, since the commit didn't land before the
    validation change did.

Fixes: 370d68babc ("nir/validate: Validate that bit sizes and components always match")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Cc: <mesa-stable@lists.freedesktop.org>
2017-05-02 10:24:40 -07:00
Jason Ekstrand
6ef1bd4fa5 anv/tests: Create a dummy instance as well as device
This fixes crashes caused by 35e626bd0e
which made us start referencing the instance in the allocators.  With
this commit, the tests now happily pass again.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100877
Tested-by: Vinson Lee <vlee@freedesktop.org>
2017-05-01 17:06:40 -07:00
Bas Nieuwenhuizen
6681ab1f97 radv: Use correct stage for ready bit.
Set the bit in the same stage as the timestamp, instead of always at top of pipe.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
2017-05-02 00:54:44 +02:00
Bas Nieuwenhuizen
568aec29d9 radv: Add top of pipe timestamp queries.
Does not fix brokenness with the ready bit.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-05-02 00:54:18 +02:00
Bas Nieuwenhuizen
14ae0bfa54 radv: Add NIR loop unrolling.
Not much effect on dota2/talos, but positive on deferred.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Timothy Arceri <timothy.arceri@itsqueeze.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-05-02 00:09:42 +02:00
Randy Xu
6f21b5601c i965: Solve Android native fence fd double close
The Android native fence in i965 has two fds: _EGLSync::SyncFd and
brw_fence::sync_fd.

The semantics of __DRI2fenceExtensionRec::create_fence_fd are unclear on
whether the DRI driver takes ownership of the incoming fd (which is the
same incoming fd from eglCreateSync).  i965 did take ownership, but all
other Mesa drivers do not; instead, they dup the incoming fd. As
a result, _EGLSync::SyncFd and brw_fence::sync_fd were the same fd, and
both egl_dri2 and i965 believed they owned it. On eglDestroySync, that
led to a double-close.

Fix the double-close by making brw_dri_create_fence_fd dup the incoming
fd, just like the other drivers do.

Signed-off-by: Randy Xu <randy.xu@intel.com>
Test: Run Vulkan and GLES stress test and no crash.
Fixes: 6403e37651 ("i965/sync: Implement fences based on Linux sync_file")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
[chadv: Polish the commit message]
Cc: mesa-stable@lists.freedesktop.org
2017-05-01 14:46:50 -07:00
Eric Anholt
d884d1a654 vc4: Only build the NEON code on arm32.
NEON is sufficiently different on arm64 that we can't just reuse this
code.  Disable it on arm64 for now.

v2: Use PIPE_ARCH_ARM instead, as __ARM_ARCH may be 8 for a 32-bit build
    for a v8 CPU.

Signed-off-by: Eric Anholt <eric@anholt.net>
Cc: <mesa-stable@lists.freedesktop.org>
2017-05-01 13:27:39 -07:00
Samuel Pitoiset
dec5b27b1b gm107/ir: add a missing assertion in emitISCADD()
For consistency, similar to the other emitters.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-05-01 11:56:49 +02:00
Timothy Arceri
de8e01698f i965: Don't allocate uniform space for samplers
Samplers are encoded into the instruction word, so there's no need to
make space in the uniform file.

Previously matrix_columns and vector_elements were set to 0, making this
else case a no-op. Commit 75a31a20af changed that, causing malloc
corruption in thousands of tests on i965.

Fixes: 75a31a20af ("glsl: set vector_elements to 1 for samplers")

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100871
2017-05-01 07:54:18 +10:00
Emil Velikov
a5c6ca9602 egl: initialise dummy_thread via _eglInitThreadInfo
Considering we cannot make dummy_thread a constant, we might as well
initialise it with the same function that handles the actual thread info.

This way we don't need to worry about mismatch between the initialiser
and initialising function.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-04-29 14:40:53 +01:00
Emil Velikov
e5efaeb85c egl: polish dri2_to_egl_attribute_map[]
Annotate the array as static const and use C99 initialiser to populate
it.
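
For readers unfamiliar with the pattern, a static const table populated with
C99 designated initialisers might look like the sketch below. The enum, the
table name and the numeric values are all hypothetical and are not taken from
dri2_to_egl_attribute_map[].

```c
#include <assert.h>

/* Hypothetical source-side attribute indices. */
enum demo_attrib {
    DEMO_ATTRIB_RED_SIZE,
    DEMO_ATTRIB_GREEN_SIZE,
    DEMO_ATTRIB_BLUE_SIZE,
    DEMO_ATTRIB_COUNT
};

/* Each entry names its index explicitly, so reordering the enum cannot
 * silently shift the mapping. Values are made up for illustration. */
static const int demo_attrib_map[DEMO_ATTRIB_COUNT] = {
    [DEMO_ATTRIB_RED_SIZE]   = 100,
    [DEMO_ATTRIB_GREEN_SIZE] = 101,
    [DEMO_ATTRIB_BLUE_SIZE]  = 102,
};

static int demo_map_attrib(enum demo_attrib attrib)
{
    assert(attrib < DEMO_ATTRIB_COUNT);
    return demo_attrib_map[attrib];
}
```

Marking the table static const keeps it in read-only data and private to the
translation unit.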

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-29 14:40:09 +01:00
Ilia Mirkin
6af14778a3 gallium/targets: fix bool setting on BE architectures
val_bool and val_int are in a union. val_bool gets the first byte, which
happens to work on LE when setting via the int, but breaks on BE. By
setting the value properly, we are able to use DRI3 on BE architectures.
Tested by running glxgears with a NV34 in a G5 PPC.
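
The endianness issue can be illustrated with a small sketch. The union below
is a hypothetical model of the one described above (demo_value is not the real
type name). It inspects the first byte after an int store, which is where the
bool member lives, and contrasts that with setting the bool member directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical option value, modelled on the union described above. */
union demo_value {
    bool val_bool; /* starts at offset 0, i.e. the first byte */
    int  val_int;
};

/* Fragile path: store through the int, then look at the first byte.
 * Little-endian puts the low byte first (1 for v == 1); big-endian puts
 * the high byte first (0 for v == 1), so the bool would read as false. */
static unsigned char demo_first_byte_after_int_store(int v)
{
    union demo_value u;
    unsigned char first;

    memset(&u, 0, sizeof(u));
    u.val_int = v;
    memcpy(&first, &u, 1);
    return first;
}

/* Portable path: write the member that will later be read. */
static bool demo_set_bool(bool v)
{
    union demo_value u;

    memset(&u, 0, sizeof(u));
    u.val_bool = v;
    return u.val_bool;
}

/* Runtime byte-order probe used only to make the sketch checkable. */
static bool demo_host_is_little_endian(void)
{
    const unsigned int one = 1;
    unsigned char first;

    memcpy(&first, &one, 1);
    return first == 1;
}
```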

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
[Emil Velikov: squash the vmwgfx hunk]
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-04-29 14:32:20 +01:00
Emil Velikov
e5c24adc22 docs: add release calendar page and references to it
Add a page listing which release is expected when, along with
associated information.

Reference it from the "Releasing process" and "Release notes" pages.

v2:
 - Add Andres for 17.0.5
 - Rework table format to include the branch (Eric)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-04-29 13:43:06 +01:00
Emil Velikov
b1d45c3366 travis: bump MAKEFLAGS to -j4
The instance should have 2 cores, yet bumping the jobs to 4 should give
us a minor speed improvement.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-04-29 13:39:40 +01:00
Emil Velikov
27a0b383b9 travis: enable wayland support
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-04-29 13:39:40 +01:00
Emil Velikov
0e6a36cd3f travis: add Gallium state-tracker targets
Split into OpenCL and others, since the former is quite time consuming.

v2:
 - explicitly enable/disable components
 - build libvdpau 1.1 requirement
 - enable st/vdpau
 - build libva 1.6.2 (API 0.38) requirement

v3: Drop ubuntu-toolchain-r-test from sources (Andres)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-04-29 13:39:40 +01:00
Emil Velikov
b3f2076549 travis: model scons check target like the make one
Should make things a bit more consistent across the board.

Cc: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-04-29 13:39:40 +01:00
Emil Velikov
7e2af37474 travis: split the make target to three separate ones
Split the target to allow faster builds for each run.

The overall build time will be more, yet Travis runs multiple builds in
parallel so we're limited by the slowest one.

Things are split roughly as:
 - DRI loaders, classic DRI drivers, classic OSMesa, make check
 - All Gallium drivers (minus the SWR) alongside st/dri (mesa)
 - The Vulkan drivers - ANV and RADV, make check (anv)

v2:
 - rework RUN_CHECK to MAKE_CHECK_COMMAND
 - explicitly disable DRI loaders
 - generate linux/memfd.h locally and enable ANV
 - add libedit-dev

v3: Use printf to create the header (Andres).
v4: Really add the libedit + printf hunks.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-04-29 13:38:11 +01:00
Emil Velikov
8479fd8a10 travis: add "make swr" to the build matrix
v2: Quote OVERRIDE variables.
v3: Add misplaced libedit-dev hunk (Andres).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-04-29 13:35:17 +01:00
Emil Velikov
f55d98ac85 travis: add "scons swr" to the build matrix
Requires GCC 5.0 (due to the C++14 requirement) and LLVM 3.9.

v2: Enable the target, add libedit-dev, rework check target.
v3: Comment the current check target, add -j4 SCONSFLAGS, quote OVERRIDE
variables.
v4: Keep check target as-is (Andres)

Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: George Kyriazis <george.kyriazis@intel.com>
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-04-29 13:35:17 +01:00
Emil Velikov
85ee2c6cfc travis: add separate "scons" and "scons llvm" targets
The former does not require any LLVM, while the latter uses LLVM 3.3.

This way we'll quickly catch any LLVM 3.3+ functionality that gets
introduced where it shouldn't.

Add the full list of addons for each build permutation.

v2: Keep libedit-dev, rework check target.
v3: Comment the current check target, add -j4 SCONSFLAGS
v4:
 - Remove llvm-toolchain-trusty-3.3 source (Andres)
 - Keep check target as-is (Andres)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-04-29 13:35:17 +01:00
Emil Velikov
56ba252e23 travis: split out matrix from env
With the next commits we'll add a couple more options.

v2: Rework check target.
v3: Comment the current check target, add -j4 SCONSFLAGS
v4: Keep check target as-is, will rework with later patch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-04-29 13:35:17 +01:00
Emil Velikov
abcfea23ad travis: rework "if test" blocks in the script section
Split the "if test" blocks so that we get more sensible output in case
of a failure.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-04-29 13:35:17 +01:00
Emil Velikov
ae713a7b79 travis: remove unused -dev packages
We effectively override libdrm-dev and libxcb-dri2-0-dev since we build
and install the package locally.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-04-29 13:35:17 +01:00
Emil Velikov
6431b98c54 travis: automatically manage ccache caching
According to the manual

"If you are using ccache, use:

  language: c # or other C/C++ variants

  cache: ccache

to cache $HOME/.ccache and automatically add /usr/lib/ccache to your
$PATH."

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-04-29 13:35:17 +01:00
Emil Velikov
486f28ba88 travis: enable apt cache
Provides a small, but consistent improvement.
Example numbers of the jobs added later in the series.

"make loaders/classic DRI" - 1s
"scons SWR" - 6s

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-04-29 13:34:55 +01:00
Andres Gomez
29322daef2 travis: add the possibility of using the txc-dxtn library
The txc-dxtn library implements the patented S3 Texture Compression
algorithm.

By default it won't be used but we add the possibility of setting the
USE_TXC_DXTN variable to yes in the travis web UI so it will be
installed and used for the scons tests.

Cc: Eric Anholt <eric@anholt.net>
Cc: Rhys Kidd <rhyskidd@gmail.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
[Emil Velikov: keep the LIB prefix, drop the LD_LIBRARY_PATH, fold URL]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-29 13:34:53 +01:00
Andres Gomez
7819d265c7 travis: replace Trusty-based LLVM toolchain apt-get with apt addon
Trusty's LLVM toolchain repository was whitelisted some time ago. See:
479067c5e7

Signed-off-by: Andres Gomez <agomez@igalia.com>
[Emil Velikov]
 - set sudo to false
 - reference the Trusty change (Rhys)
 - keep libedit-dev
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-29 13:34:53 +01:00
Emil Velikov
cb820daa3f travis: explicitly LD_LIBRARY_PATH the local libraries
Some of the libraries may be dlopened, which may not always work due to
the non-standard prefix that we're using.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-04-29 13:34:53 +01:00
Brian Paul
52d69c2e8d st/wgl: whitespace, formatting fixes in stw_pixelformat.c
Trivial.
2017-04-28 22:01:34 -06:00
Charmaine Lee
ba8e2ea19a st/wgl: allow WGL_BIND_TO_TEXTURE_RGB_ARB for RGBA visuals
We do not need to restrict WGL_BIND_TO_TEXTURE_RGB_ARB to
RGB visuals only. It can be supported with RGBA visuals as well.

This fixes the early exit of cinebench-r15-test trace.

Tested with cinebench-r15, piglit, glretrace.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-28 22:01:24 -06:00
Brian Paul
d06045dfdd st/wgl: use ARRAY_SIZE() macro in wglChoosePixelFormatARB()
Trivial.
2017-04-28 21:37:07 -06:00
Brian Paul
394f8dacbc st/wgl: whitespace/formatting fixes in stw_ext_pixelformat.c
Trivial.
2017-04-28 21:37:06 -06:00
Neha Bhende
197907c926 svga: implement sRGB rendering for imported surfaces
If a texture is imported and the templ format is sRGB, use the sRGB format
compatible with the imported texture format when creating the surface view.

tested with MTT piglit, glretrace, viewperf and conform

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-28 21:03:06 -06:00
Neha Bhende
1b415a5b28 svga: add function svga_linear_to_srgb()
This function returns the compatible svga sRGB format for the corresponding
linear format.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-28 21:03:06 -06:00
Neha Bhende
6e06e281c6 glx: add missing sRGB attribute check in fbconfigs_compatible()
This patch allows the driver to choose an sRGB-capable FBconfig
if the GLX_FRAMEBUFFER_SRGB_CAPABLE_ARB attribute is 1.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-28 21:03:06 -06:00
Thomas Hellstrom
ca59fd1706 svga: Add a more elaborate format compatibility determination v2
dri3 is a bit sloppy about its format compatibility requirements, so add
a possibility to import xrgb surfaces as argb textures and vice versa.

At the same time, make the svga_texture_from_handle() function a bit more
readable and fix the error path where we leaked a winsys surface.

v2: Addressed review comments by Brian.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-28 21:03:06 -06:00
Tim Rowley
18d5c452d0 swr/rast: add memory api to SwrGetInterface()
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:57:09 -05:00
Tim Rowley
a46539af11 swr/rast: use gather instruction for odd format fetch
Small fetch performance optimization - use gather instruction
for odd format fetch instead of slow emulated code.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:57:02 -05:00
Tim Rowley
eff909de7d swr/rast: enable SIMD16 8x2 tile backend
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:56:56 -05:00
Tim Rowley
5fde2ae533 swr/rast: add SwrInit() to init backend/memory tables
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:56:50 -05:00
Tim Rowley
e8d58049f6 swr/rast: increment depth/stencil tile pointer in SIMD16 BE
A misplaced #endif prevented the depth and stencil hot tile pointers
from incrementing in the SIMD16 8x2 configuration of BackendPixelRate.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:56:42 -05:00
Tim Rowley
d4c1486737 swr/rast: add SwrGetInterface() function to return api
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:56:34 -05:00
Tim Rowley
dabd0499a6 swr/rast: enable per-warp scratch space for CS
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:56:28 -05:00
Tim Rowley
0424e6249a swr/rast: reduce simd{16}vertex stack for VS output
Frontend - reduce simdvertex/simd16vertex stack usage for VS output in
ProcessDraw, fixes stack overflow in some of the deeper call stacks under
SIMD16.

1. Move the vertex store out of PA_FACTORY, and off the stack
2. Allocate the vertex store out of the aligned heap (pointer is
   temporarily stored in TLS, but will be migrated to thread pool
   along with other frontend temporary buffers).
3. Grow the vertex store as necessary for the number of verts per
   primitive, in chunks of 8/4 simdvertex/simd16vertex
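
The grow-in-chunks idea from the list above could be sketched like this. It is
only an illustration under stated assumptions: demo_vertex_store, the 64-byte
alignment and the choice to discard old contents on growth are all
hypothetical, and C11 aligned_alloc() stands in for whatever aligned heap
allocator SWR actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_ALIGNMENT 64 /* assumed alignment, not taken from SWR */

/* Hypothetical heap-backed vertex store that replaces a stack array. */
struct demo_vertex_store {
    void  *data;
    size_t capacity; /* in vertices */
};

/* Round n up to the next multiple of chunk. */
static size_t demo_round_up(size_t n, size_t chunk)
{
    return (n + chunk - 1) / chunk * chunk;
}

/* Ensure room for num_verts vertices, growing in whole chunks.
 * Returns 1 on success, 0 on allocation failure. Old contents are
 * discarded, which assumes the store only holds temporary data. */
static int demo_store_reserve(struct demo_vertex_store *store,
                              size_t num_verts, size_t chunk,
                              size_t vert_size)
{
    assert(chunk > 0);
    assert(vert_size > 0 && vert_size % DEMO_ALIGNMENT == 0);

    if (num_verts <= store->capacity)
        return 1;

    size_t new_capacity = demo_round_up(num_verts, chunk);
    void *data = aligned_alloc(DEMO_ALIGNMENT, new_capacity * vert_size);
    if (!data)
        return 0;

    free(store->data);
    store->data = data;
    store->capacity = new_capacity;
    return 1;
}
```

A caller would pass a chunk of 8 for the simdvertex case and 4 for the
simd16vertex case, matching the chunk sizes named above.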

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:56:17 -05:00
Tim Rowley
536baf507e swr/rast: remove default argument from SwrSync()
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:56:11 -05:00
Tim Rowley
145bf5aa5b swr/rast: remove unused variables in the SIMD16 FE
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:57 -05:00
Tim Rowley
20f3a30219 swr/rast: move construction of const above goto
Fixes gcc error for SIMD16 FE.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:50 -05:00
Tim Rowley
feefd3ef4e swr/rast: name threads to aid debugging
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:40 -05:00
Tim Rowley
9b907599b6 swr/rast: disable buffer overrun warning for Assemble()
Disabling buffer overrun warning for Assemble(uint32_t slot,
simdvector *verts) due to what looks like a MSVC compiler bug
when compiling the SIMD16 FE.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:33 -05:00
Tim Rowley
d523b82498 swr/rast: clean up clipper comments
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:26 -05:00
Tim Rowley
8c0e0bf141 swr/rast: add SIMDAPI decorators in binner/clipper
Fixes MSVC errors with SIMD16 FE.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:20 -05:00
Tim Rowley
42d804b2a3 swr/rast: add additional jit utility functions
Not used yet.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:55:02 -05:00
Tim Rowley
a373f1f27a swr/rast: more flexible max attribute slots
Ability to allocate space for an arbitrary number (at compile time)
of positions in the vertex layout.

Removes KNOB_NUM_ATTRIBUTES from knobs.h, replaces the VTX slot
number #defines with the SWR_VTX_SLOTS enum (which contains
replacement for NUM_ATTRIBUTES: SWR_VTX_NUM_SLOTS)

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-28 19:53:39 -05:00
Kenneth Graunke
54d42cd976 i965: Drop BRW_NEW_CONTEXT from 3DSTATE_DS/GS on Gen7-7.5.
We already have BRW_NEW_BATCH, which completely covers all the cases
that BRW_NEW_CONTEXT would handle.  Drop it.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-04-28 17:03:33 -07:00
Kenneth Graunke
1d0e974406 i965: Drop _NEW_TRANSFORM from 3DSTATE_DS/GS on Gen7-7.5.
There's no reason for this as far as I can tell.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-04-28 17:03:33 -07:00
Kenneth Graunke
a1f12574b0 i965: Set point rasterization rule to UPPER_RIGHT on Gen6-7.5.
Gen4-5 and Gen8+ already set this, but Gen6-7.5 did not.  We ought to
be consistent - the answer depends on the API, not the hardware generation.

The Sandybridge PRM says about RASTRULE_UPPER_RIGHT:

   "To match OpenGL point rasterization rules (round to +infinity, where
    this is the upper right direction wrt OpenGL screen origin of lower
    left)."

So this is likely the one we should use.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-04-28 17:03:33 -07:00
Kenneth Graunke
4878ab9bd4 i965: Always set AALINEDISTANCE_TRUE on Sandybridge.
We set this unconditionally on every other platform.  Zero (Manhattan)
isn't even listed as an option in the Sandybridge docs - only "true".

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-28 17:03:33 -07:00
Kenneth Graunke
b625bcc601 i965: Use true AA line distance on G45/Ironlake.
The original Broadwater and Crestline platforms computed antialiased
line distances using "manhattan" distance, aka a + b = c.  Eaglelake
and Cantiga added "true" distance, which apparently does something
like max(a, b) + min(a, b) / 4.  Not exactly "true", but at least
more accurate.
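
To make the difference concrete, the two formulas quoted above can be written
out as below. This is a sketch of the arithmetic only; the function names are
hypothetical, and the "true" formula is the one this message says the hardware
"apparently" uses, not a confirmed hardware specification.

```c
#include <assert.h>

/* Manhattan distance as described above: a + b. */
static float demo_aa_dist_manhattan(float a, float b)
{
    assert(a >= 0.0f && b >= 0.0f);
    return a + b;
}

/* "True" distance approximation as described above:
 * max(a, b) + min(a, b) / 4. */
static float demo_aa_dist_true(float a, float b)
{
    assert(a >= 0.0f && b >= 0.0f);

    float hi = a > b ? a : b;
    float lo = a > b ? b : a;
    return hi + lo / 4.0f;
}
```

For a = 3 and b = 4 the Euclidean distance is 5; the manhattan form gives 7,
while the approximation gives 4.75, which is much closer.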

The G45 documentation indicates that the old manhattan distance setting
is "only for debug purposes" and should never be used.  The Ironlake
documentation no longer mentions AALINEDISTANCE_MANHATTAN, though it
does still contain the narrative about the feature.

At any rate, we should use the more accurate mode.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-28 17:03:33 -07:00
Andres Gomez
81149c8f52 docs: add news item and link release notes for 17.0.5
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-04-29 01:21:17 +03:00
Andres Gomez
e06aec99f2 docs: add sha256 checksums for 17.0.5
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 6cb65ce2d3)
2017-04-29 01:20:51 +03:00
Andres Gomez
0ad8c4f375 docs: add release notes for 17.0.5
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 61b134a862)
2017-04-29 01:19:51 +03:00
Marek Olšák
7a515a607c radeonsi: don't load unused compute shader input SGPRs and VGPRs
Basically, don't load GRID_SIZE or BLOCK_SIZE if they are unused, determine
whether to load BLOCK_ID for each component separately, and set the number
of THREAD_ID VGPRs to load. Now we should get the maximum CS launch wave
rate in most cases.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:57:44 +02:00
Marek Olšák
46e48d4044 tgsi/scan: record compute shader system value usage
v2: just do indexing with swizzle[i]

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
fa15436e63 radeonsi: add a HUD query for draw calls with primitive restart
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
55445ff189 radeonsi: tell LLVM not to remove s_barrier instructions
LLVM 5.0 removes s_barrier instructions if the max-work-group-size
attribute is not set. What a surprise.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
0490074cab radeonsi: fix tess offchip offset for per-patch attributes
We need 4 more bits there. I don't know what is fixed by this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
4e50062028 radeonsi: pass tessellation ring addresses via user SGPRs
This removes s_load_dword latency for tess rings.

We need just 1 SGPR for the address if we use 64K alignment. The final asm
for recreating the descriptor is:

    // s2 is (address >> 16)
    s_mov_b32 s3, 0
    s_lshl_b64 s[4:5], s[2:3], 16
    s_mov_b32 s6, -1
    s_mov_b32 s7, 0x27fac
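
The same packing and unpacking can be mirrored in C as a rough sketch. The
function names are hypothetical; the constants in dwords 2 and 3 are copied
from the asm above without interpreting their bit fields.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a 64K-aligned address into a single 32-bit value (address >> 16),
 * which is what fits into one user SGPR. */
static uint32_t demo_pack_ring_address(uint64_t address)
{
    assert((address & 0xffff) == 0);       /* 64K alignment */
    assert((address >> 16) <= UINT32_MAX); /* must fit in 32 bits */
    return (uint32_t)(address >> 16);
}

/* Rebuild the four descriptor dwords from the packed value. */
static void demo_rebuild_descriptor(uint32_t packed, uint32_t desc[4])
{
    /* s_mov_b32 s3, 0 ; s_lshl_b64 s[4:5], s[2:3], 16 */
    uint64_t address = (uint64_t)packed << 16;

    desc[0] = (uint32_t)address;
    desc[1] = (uint32_t)(address >> 32);
    desc[2] = 0xffffffffu; /* s_mov_b32 s6, -1 */
    desc[3] = 0x27fac;     /* s_mov_b32 s7, 0x27fac */
}
```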

v2: bitcast the descriptor type from v2i64 to v4i32

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
2823e15f60 radeonsi: use si_insert_input_ret in si_llvm_emit_tcs_epilogue
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
9fd9a7d0ba radeonsi: remove VS epilog code, compile VS with PrimID export on demand
The use of PrimID in the pixel shader is too rare to deserve such
sizable support code.

The initial idea of the VS epilog was to move the clipping code there and
remove it based on states, but optimized variants are now used to do that
and are easier to support, so the VS epilog has turned out to be not so
useful.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
3b2e93e472 radeonsi: get InstanceID from VGPR1 (or VGPR2 for tess) instead of VGPR3
VGPR1 = InstanceID / StepRate0; // StepRate0 can be set to 1

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
678d568c7b radeonsi: don't load PrimID in TES if it's not used
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
808c33f6f0 radeonsi: explain (non-)monolithic shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
fc478248f3 radeonsi/gfx9: enable OpenGL 4.5
Tentatively enable it, expecting the scratch buffer support to be done before
the next Mesa release.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
ed9a51cd3b radeonsi/gfx9: 2nd shader of merged shaders should hold a reference of the 1st
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
ef40937854 radeonsi: add reference counting for shader selectors
The 2nd shader of merged shaders should take a reference of the 1st shader.
The next commit will do that.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
6c15e15af4 radeonsi/gfx9: set VGT_VERTEX_REUSE for ES in ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
887ef1de34 radeonsi/gfx9: set TES registers for merged ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
49cd0cbfd5 radeonsi/gfx9: disallow scratch buffer for LS-HS and ES-GS
not implemented yet

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
2857b14bba radeonsi/gfx9: always compile monolithic ES-GS (asynchronously)
In addition to the non-monolithic variant.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
a82398a8f5 radeonsi/gfx9: add support for monolithic ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
6a9c20fdd5 radeonsi/gfx9: make sure the 1st shader's main part exists for merged shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
7df682c291 radeonsi/gfx9: select shader parts for non-monolithic ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
cd99c442c4 radeonsi/gfx9: add GS prolog support for merged ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
e0570bc283 radeonsi/gfx9: add VS prolog support for merged ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
6b93452b24 radeonsi/gfx9: pass GS input SGPRs and VGPRs from the ES part to GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
37e22ab65e radeonsi/gfx9: store ES outputs to LDS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
d616c57342 radeonsi/gfx9: load GS inputs from LDS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
fc781fa0ab radeonsi/gfx9: get GS wave ID from the correct input
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
bcaf905129 radeonsi/gfx9: add the function signature of merged ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
8b220877ad radeonsi/gfx9: set registers and shader key for merged ES-GS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
ab197ad8d1 radeonsi/gfx9: add GS user SGPRs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
b2f5d03152 radeonsi: rename declare_tess_lds -> declare_lds_as_pointer
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
e3caa1cd36 radeonsi: simplify some shader type conditions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
021e65640e radeonsi: rename the swizzle parameter of lds_store
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
dcea7e5d19 radeonsi: add si_shader::prolog2
For a GS prolog in merged ES-GS.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
eb35238ffe radeonsi/gfx9: move RW_BUFFERS to s[0:1] for merged shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
0af00f179e radeonsi/gfx9: add support for monolithic merged LS-HS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
0d6d25475d radeonsi/gfx9: set EXEC for non-mono merged shaders, add a barrier between them
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
a84a6feac9 radeonsi/gfx9: don't store the HS control word
GFX9 doesn't have it.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
1d90ecd3a5 radeonsi/gfx9: pass inputs from LS to TCS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
cbd1bc2e3e radeonsi/gfx9: add TCS epilog support for merged LS-HS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
f11ced475e radeonsi/gfx9: add VS prolog support for merged LS-HS
HS input VGPRs must be reserved.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
82a0e4f658 radeonsi/gfx9: merged shaders have scratch offset at the beginning
also, screen wasn't initialized for compute shaders

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
0c253557b2 radeonsi/gfx9: define LS-HS main shader function prototype
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
852ea69a2d radeonsi: assign VS/TCS/TES/GS shader input parameter locations dynamically
They will vary with merged stages.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
067dacd1b1 radeonsi/gfx9: define and set LS-HS user SGPRs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
0588146cb0 radeonsi/gfx9: set up shader registers for merged LS-HS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
62abdb17bb radeonsi/gfx9: add initial code generation for non-monolithic merged LS-HS
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
c73d9bd643 radeonsi: separate out code for selecting the VS prolog
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
a98c9ba580 radeonsi/gfx9: add si_shader::previous_stage for merged shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
cfb0798bb3 radeonsi/gfx9: enlarge num_input_sgprs in shader keys due to higher hw limit
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
4ab36e0ebc radeonsi/gfx9: update the summary of shader stage configs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
9d6ed572d9 radeonsi: adjust the signature of si_get_vs_prolog_key
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
b1ed3ffc56 radeonsi: separate out VS prolog key generation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
e4542f00ce radeonsi: separate out VS prolog key printing
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
983d7e743e radeonsi: code shuffling in si_emit_derived_tess_state
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
130e198c49 radeonsi: separate out TGSI initialization of si_shader_context
so that we can put multiple different TGSI shaders into one module.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:47:35 +02:00
Marek Olšák
c3f37e9b50 st/mesa: use min_index and max_index directly from vbo
also remove the incorrect comment about primitive restart.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:46:44 +02:00
Marek Olšák
53cd67859d vbo: set min_index = 0 so gallium can use the value directly
We could also remove index_bounds_valid and use max_index != ~0 instead.
Opinions on that are welcome.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-28 21:46:44 +02:00
Matt Turner
ee70937d15 Revert "glsl: reject image qualifiers with non-image types inside uniform blocks"
This reverts commit 24011ead71.

This causes lots of ES 3.1 CTS tests to fail to compile a bit of code
like:

   layout(binding = 0) buffer InOut
   {
        highp uint inputValues[384];
        highp uint outputValues[384];
        coherent highp uint groupValues[64];      <-----
   } sb_inout;

   error: memory qualifiers may only be applied to images
2017-04-28 12:31:20 -07:00
Brian Paul
27469aa72e st/mesa: add more fallback gallium formats for GL integer formats
The VMware driver has a limited set of integer texture formats.  We
often have to fall back to 4-component formats when 1- or 2-component
formats are missing.

This fixes about 8 integer texture Piglit tests with the VMware driver
on Linux.  We've had this code in-house for a long time but I guess it
was never up-streamed to Mesa master.

This shouldn't regress any other drivers since we're either choosing
an earlier format in the list, or failing anyway.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-28 13:12:31 -06:00
Brian Paul
6b60153f04 mesa: optimize color_buffer_writes_enabled()
Return as soon as we find an existing color channel that's enabled for
writing.  Typically, this allows us to return true on the first loop
iteration instead of doing four iterations.
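
The early-return shape can be sketched as follows. This is a simplified,
hypothetical model (two bool arrays instead of Mesa's context and format
state), not the actual color_buffer_writes_enabled() implementation.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NUM_CHANNELS 4

/* Hypothetical inputs: which channels the buffer format actually has,
 * and which channels the current color mask allows writing. */
static bool demo_color_writes_enabled(const bool channel_exists[DEMO_NUM_CHANNELS],
                                      const bool write_mask[DEMO_NUM_CHANNELS])
{
    for (int c = 0; c < DEMO_NUM_CHANNELS; c++) {
        /* Return as soon as one existing channel is writable. */
        if (channel_exists[c] && write_mask[c])
            return true;
    }
    return false;
}
```

In the common case where the first channel exists and is unmasked, the loop
exits after a single iteration.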

No piglit regressions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-28 13:12:31 -06:00
Brian Paul
054fb129e1 st/mesa: whitespace clean-ups in st_manager.c
Trivial.
2017-04-28 13:12:31 -06:00
Matt Turner
b64da3d14e Revert "glsl: set vector_elements to 1 for samplers"
This reverts commit 75a31a20af.

This breaks thousands of tests on i965 with malloc corruption.
2017-04-28 11:48:57 -07:00
Chad Versace
85ca563b58 anv: Drop 'x11' prefix from non-X11 WSI funcs
Drop it from x11_anv_wsi_image_create and x11_anv_wsi_image_free. The
functions are used by Wayland WSI too.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2017-04-28 08:54:45 -07:00
Jason Ekstrand
ebd1bd6998 anv: Alphabetize KHR extensions
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-04-28 07:41:03 -07:00
Emil Velikov
c0139955fa ac: automake: sort sources list alphabetically
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-28 14:13:01 +01:00
Emil Velikov
ecc39b6650 ac: include all sources in the tarball
Fixes: e2659176ce ("radeonsi/ac: move vertex export remove to common code.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-28 14:13:00 +01:00
Nicolai Hähnle
9d346af322 st/mesa: remove redundant stfb->iface checks
stfb->iface is always non-NULL for an st_framebuffer. These checks
were incorrect, relying on out-of-bounds memory access in the
surface-less case of EGL_KHR_surfaceless_context.

v2: remove redundant stread check (Marek)

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
2017-04-28 11:34:00 +02:00
Nicolai Hähnle
19b61799e3 st/mesa: don't cast the incomplete framebuffer to st_framebuffer
The incomplete framebuffer is set for a surfaceless context. This leads to
the following error in piglit spec@egl_khr_surfaceless_context@viewport:

==26703==ERROR: AddressSanitizer: global-buffer-overflow on address 0x7f6886e43240 at pc 0x7f68854db0fd bp 0x7ffca404b3b0 sp 0x7ffca404b3a0
READ of size 8 at 0x7f6886e43240 thread T0
    #0 0x7f68854db0fc in st_viewport ../../../mesa-src/src/mesa/state_tracker/st_cb_viewport.c:57
    #1 0x556840176cdb in main tests/egl/spec/egl_khr_surfaceless_context/viewport.c:101
    #2 0x7f688edcf3f0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x203f0)
    #3 0x556840176e19 in _start (/home/nha/amd/piglit/bin/egl-surfaceless-context-viewport+0xe19)

0x7f6886e43240 is located 32 bytes to the left of global variable 'DummyRenderbuffer' defined in '../../../mesa-src/src/mesa/main/fbobject.c:69:31' (0x7f6886e43260) of size 112
0x7f6886e43240 is located 8 bytes to the right of global variable 'IncompleteFramebuffer' defined in '../../../mesa-src/src/mesa/main/fbobject.c:73:30' (0x7f6886e42de0) of size 1112
SUMMARY: AddressSanitizer: global-buffer-overflow ../../../mesa-src/src/mesa/state_tracker/st_cb_viewport.c:57 in st_viewport

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-28 11:34:00 +02:00
Nicolai Hähnle
28ec0fc7b8 st/glsl_to_tgsi: make undef_src and undef_dst const
2017-04-28 11:34:00 +02:00
Nicolai Hähnle
6cbb8f99d2 st/glsl_to_tgsi: cleanup using visit_generic_intrinsic
It turns out that explicitly setting the writemask isn't actually
needed; emit_asm does the right thing based on looking at the types.
2017-04-28 11:34:00 +02:00
Nicolai Hähnle
ce55afc4d6 glsl: remove the shader_group_vote and shader_ballot expression ops
They are now no longer used.
2017-04-28 11:33:59 +02:00
Nicolai Hähnle
0aef96e00c glsl: implement arb_shader_ballot builtins using intrinsics
2017-04-28 11:33:59 +02:00
Nicolai Hähnle
2c30ea3fcd glsl: implement arb_shader_group_vote builtins via intrinsics
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-28 11:33:59 +02:00
Nicolai Hähnle
944455217b st/glsl_to_tgsi: implement shader_group_vote and shader_ballot intrinsics
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-28 11:33:59 +02:00
Nicolai Hähnle
99941a9724 glsl: add intrinsics for ARB_shader_group_vote and ARB_shader_ballot
These operations are currently implemented as IR expressions. However,
they cannot be transformed and moved in the way that other IR
expressions can because they have non-trivial interactions with
control-flow.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-28 11:33:58 +02:00
Samuel Pitoiset
24011ead71 glsl: reject image qualifiers with non-image types inside uniform blocks
Fixes the following ARB_shader_image_load_store tests:

format-layout-with-non-image-type.frag
memory-qualifier-with-non-image-type.frag

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-28 10:43:53 +02:00
Samuel Pitoiset
edb4a1ab2d glsl: introduce validate_image_qualifier_for_type() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-28 10:43:13 +02:00
Samuel Pitoiset
80738425e4 glsl: fix error when using format qualifiers with non-image types
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-28 10:43:04 +02:00
Timothy Arceri
22fa3d90a9 util/disk_cache: remove percentage based max cache limit
The more I think about it the more this seems like a bad idea.
When we were deleting old cache dirs this wasn't so bad as it
was unlikely we would ever hit the actual limit before things
were cleaned up. Now that we only start cleaning up old cache
items once the limit is reached, a percentage-based max
cache limit is more risky.

For the initial release of shader cache I think it's better to
stick to a more conservative cache limit, at least until we
have some way of cleaning up the cache more aggressively.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-04-28 14:35:27 +10:00
Jason Ekstrand
032861693e anv: Move queues, events, and semaphores to their own file
Things are about to get more complicated, especially as far as
semaphores are concerned.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-04-27 20:08:46 -07:00
Jason Ekstrand
9bd1f03487 anv: Implement VK_KHX_external_memory_fd
This commit just exposes the memory handle type.  There's nothing
interesting we need to do here for images.  So long as the user doesn't set any crazy
environment variables such as INTEL_DEBUG=nohiz, all of the compression
formats etc. should "just work" at least for opaque handle types.

v2 (chadv):
  - Rebase.
  - Fix vkGetPhysicalDeviceImageFormatProperties2KHR when
    handleType == 0.
  - Move handleType-independency comments out of handleType-switch, in
    vkGetPhysicalDeviceExternalBufferPropertiesKHX.  Reduces diff in
    future dma_buf patches.

Co-authored-with: Chad Versace <chadversary@chromium.org>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-04-27 20:08:46 -07:00
Jason Ekstrand
818b857914 anv: Use the BO cache for DeviceMemory allocations
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-04-27 20:08:46 -07:00
Jason Ekstrand
494d6f65a7 anv/allocator: Add a BO cache
This cache allows us to easily ensure that we have a unique anv_bo for
each gem handle.  We'll need this in order to support multiple-import of
memory objects and semaphores.

v2 (Jason Ekstrand):
 - Reject BO imports if the size doesn't match the prime fd size as
   reported by lseek().
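
The size check mentioned in v2 can be sketched as below. This is a minimal
illustration assuming a seekable file descriptor; fd_size_matches() is a
hypothetical name and is not the function anv uses.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: compare the size the caller claims for an imported
 * fd against the size the kernel reports when seeking to the end. */
static bool fd_size_matches(int fd, uint64_t expected_size)
{
   off_t size = lseek(fd, 0, SEEK_END);
   if (size == (off_t)-1)
      return false;   /* invalid or non-seekable fd: reject the import */
   return (uint64_t)size == expected_size;
}
```

An import path would reject the BO when this returns false rather than trust
the size supplied by the application.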

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-04-27 20:08:46 -07:00
Jason Ekstrand
5d25ac6a4b anv: Implement VK_KHX_external_memory
This is the trivial implementation that just exposes the extension
string but exposes zero external handle types.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-04-27 20:08:46 -07:00
Chad Versace
354ca7a1d4 anv: Implement VK_KHX_external_memory_capabilities
This is a complete but trivial implementation. It's trivial because we
support no external memory capabilities yet.  Most of the real work in
this commit is in reworking the UUIDs advertised by the driver.

v2 (chadv):
  - Fix chain traversal in vkGetPhysicalDeviceImageFormatProperties2KHR.
    Extract VkPhysicalDeviceExternalImageFormatInfoKHX from the chain of
    input structs, not the chain of output structs.
  - In vkGetPhysicalDeviceImageFormatProperties2KHR, iterate over the
    input chain and the output chain separately. Reduces diff in future
    dma_buf patches.

Co-authored-with: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-27 20:08:46 -07:00
Jason Ekstrand
d4d9258b61 anv/physical_device: Rename uuid to pipeline_cache_uuid
We're about to have more UUIDs for different things so this one really
needs to be properly labeled.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-04-27 20:08:46 -07:00
Jason Ekstrand
02767cb4ff anv: Refactor device_get_cache_uuid into physical_device_init_uuids
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-04-27 20:08:46 -07:00
Jason Ekstrand
35e626bd0e anv: Set EXEC_OBJECT_ASYNC when available
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-04-27 20:08:46 -07:00
Jason Ekstrand
bd3a9813b9 anv/cmd_buffer: Use the device allocator for QueueSubmit
The command is really operating on a Queue not a command buffer and the
nearest object to that with an allocator is VkDevice.

Reviewed-by: Chad Versace <chadversary@chromium.org>
Cc: "17.0 17.1" <mesa-dev@lists.freedesktop.org>
2017-04-27 20:08:46 -07:00
Timothy Arceri
2bc06767e1 mesa: remove wip framebuffer code
This was added in 34b3b40af9 back in 2006. Seems it wasn't
needed.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-28 10:19:59 +10:00
Samuel Pitoiset
75a31a20af glsl: set vector_elements to 1 for samplers
I don't see any reasons why vector_elements is 1 for images and
0 for samplers. This increases consistency and allows us to clean
up some code a bit.

This will also help for ARB_bindless_texture.

No piglit regressions with RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-27 22:52:21 +02:00
Jan Vesely
b295a52836 clover: Fix build since clang r301442
v2: rename default_ik -> ik_opencl

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-27 12:52:25 -04:00
Timothy Arceri
4e1f3afea9 disk_cache: use block size rather than file size
The majority of cache files are smaller than 1kB, which resulted in us
greatly miscalculating the amount of disk space used by the cache.

Using the number of blocks allocated to the file is more
conservative and less likely to cause issues.
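
The difference between the two measurements can be sketched as follows. This
is an illustration, not the disk_cache code: blocks_to_bytes() and
cache_file_sizes() are hypothetical names, and the 512-byte unit for
st_blocks is an assumption that holds on Linux.

```c
#include <stdint.h>
#include <sys/stat.h>

/* Assumption: st_blocks counts 512-byte units, as it does on Linux,
 * independent of the filesystem's st_blksize. */
static uint64_t blocks_to_bytes(uint64_t blocks)
{
   return blocks * 512;
}

/* Hypothetical helper: returns 0 and fills in both sizes, or -1 if
 * stat() fails. */
static int cache_file_sizes(const char *path,
                            uint64_t *apparent, uint64_t *allocated)
{
   struct stat sb;
   if (stat(path, &sb) != 0)
      return -1;
   *apparent = (uint64_t)sb.st_size;                      /* file size */
   *allocated = blocks_to_bytes((uint64_t)sb.st_blocks);  /* disk usage */
   return 0;
}
```

On a filesystem with 4096-byte blocks, a 300-byte cache entry would typically
have an apparent size of 300 but an allocated size of 4096, which is the
under-counting the commit describes.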

This change will result in cache sizes being miscalculated further
until old items added with the previous calculation have all been
removed. However, I don't see any way around that; the previous
patch should help limit that problem.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-and-Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2017-04-27 20:44:00 +10:00
Timothy Arceri
ce41237151 disk_cache: reduce default cache size to 5% of filesystem
Modern disks are extremely large and are only going to get bigger.
Usage has shown frequent Mesa upgrades can result in the cache
growing very fast i.e. wasting a lot of disk space unnecessarily.

5% seems like a more reasonable default.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2017-04-27 20:43:50 +10:00
Dave Airlie
f4743763ce radeon/ac: remove assert causing regression
This assert wasn't in the original radeonsi code; I added it
without totally understanding the original code, and it caused
some regressions in variable-indexing tessellation shaders.

Fixes: e2659176 radeonsi/ac: move vertex export remove to common code.
Reported-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-27 11:38:54 +01:00
Dave Airlie
550281f934 radeon/ac: fix build on llvm 3.8.1
Add missing include to fix build.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-27 11:22:12 +01:00
Boyan Ding
63df869f08 nvc0: Enable compute support for Pascal
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-27 11:11:15 +02:00
Boyan Ding
d03bfb078b nvc0: Add new launch descriptor format for GP100
v2:
Also handle the new format in indirect dispatch
Use compute class check instead of chipset check

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-27 11:11:12 +02:00
Boyan Ding
2e35bd964e nvc0: Fix index of unk fields in nve4_cp_launch_desc
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-27 11:11:10 +02:00
Boyan Ding
4a9f7bfe90 nouveau: Fix indentation of maxwell compute class definitions
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-27 11:11:07 +02:00
Jason Ekstrand
c43b4bc85e anv: Don't place scratch buffers above the 32-bit boundary
This fixes rendering corruptions in DOOM.  Hopefully, it will also make
Jenkins a bit more stable as we've been seeing some random failures and
GPU hangs ever since turning on 48bit.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100620
Fixes: 651ec926fc "anv: Add support for 48-bit addresses"
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
2017-04-27 02:04:57 -07:00
Dave Airlie
f205e19e4f radv/ac: eliminate unused vertex shader outputs. (v2)
This is ported from radeonsi, and I can see at least one
Talos shader drops an export due to this, and saves some
VGPR usage.

v2: use shared code.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-27 05:18:52 +01:00
Dave Airlie
e2659176ce radeonsi/ac: move vertex export remove to common code.
This code can be shared by radv, we bump the max to
VARYING_SLOT_MAX here, but that shouldn't have too
much fallout.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-27 05:17:47 +01:00
Dave Airlie
9da1045933 radv: fix regression in descriptor set freeing.
Since the host pool changes,

Fixes:
dEQP-VK.api.descriptor_pool.out_of_pool_memory

Fixes: 126d5ad "radv: Use host memory pool for non-freeable descriptors."
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-27 10:50:46 +10:00
Timothy Arceri
f8a2d00046 glsl: remove duplicate validation
Varying types have already been validated in
apply_type_qualifier_to_variable() by this point.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-27 08:21:28 +10:00
Timothy Arceri
52c76dbad3 glsl: use without_array() rather than get_scalar_type()
Here get_scalar_type() was just being used to remove the array;
after that we converted it back to base_type anyway, so just
use the without_array() helper.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-04-27 08:21:21 +10:00
Brian Paul
28feb63580 svga: fix vertex buffer binding issue
When we ran Viewperf11's Maya-03 test 3 we saw warnings about flushing
the command buffer with mapped buffers.  This happened when transitioning
from hardware rendering to a 'draw' fallback path.

The problem is the util_set_vertex_buffers_count() function doesn't do
exactly what we want in svga_hwtnl_vertex_buffers().  In a case such as
dst_count=2, dst={bufA, bufB}, count=1 and src={bufC}, when the function
returns we'll have dst_count=2 and dst={bufC, bufB}.  What we really want
is dst_count=1 and dst={bufC, NULL}.  As it was, we were telling the svga
device that there were two vertex buffers when in fact we really only
needed one for the subsequent drawing command.
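
The two behaviours contrasted above can be sketched like this. The code is a
simplified illustration: struct vbuf, set_buffers_overlay() and
set_buffers_exact() are hypothetical names, and the real gallium helper takes
additional parameters not modelled here.

```c
#include <stddef.h>

/* Hypothetical stand-in for a vertex buffer binding. */
struct vbuf {
   const void *resource;
};

/* Overlay semantics: the first `count` slots are replaced, but slots
 * beyond `count` and the old dst_count survive. */
static void set_buffers_overlay(struct vbuf *dst, unsigned *dst_count,
                                const struct vbuf *src, unsigned count)
{
   for (unsigned i = 0; i < count; i++)
      dst[i] = src[i];
   if (count > *dst_count)
      *dst_count = count;
}

/* Exact semantics: dst ends up describing only the `count` source
 * buffers; stale slots are cleared and dst_count shrinks. */
static void set_buffers_exact(struct vbuf *dst, unsigned *dst_count,
                              const struct vbuf *src, unsigned count)
{
   for (unsigned i = 0; i < count; i++)
      dst[i] = src[i];
   for (unsigned i = count; i < *dst_count; i++)
      dst[i].resource = NULL;
   *dst_count = count;
}
```

Starting from dst={bufA, bufB} and src={bufC}, the overlay version leaves
dst_count=2 and dst={bufC, bufB}, while the exact version leaves dst_count=1
and dst={bufC, NULL}.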

In this particular case, we first did hardware drawing with {bufA, bufB}
then we transitioned to the 'draw' module, consuming vertex data from
bufA and bufB and writing the new vertex data to bufC.  bufA and bufB are
mapped for reading when we flush the command buffer but should not be
referenced by the command buffer.  The above change fixes that.

No Piglit regressions.  Also tested with Viewperf, Google Earth, Heaven,
etc.

VMware bug 1842059

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-26 11:38:00 -06:00
Brian Paul
a36a1ea80a gallium/util: reduce util_snprintf() calls in debug_flush_might_flush_cb()
We only need to construct the debug message if the mapped_sync flag is set.
This should make the function faster since the flag is usually false.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-26 11:38:00 -06:00
Brian Paul
495840658e gallium/util: add some comments in u_debug_flush.c
Trivial.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-26 11:37:59 -06:00
Charmaine Lee
fbda9b905a svga: Removed the unused label 'done' in svga_validate_surface_view()
Trivial fix
2017-04-26 11:37:59 -06:00
Charmaine Lee
019d5d5346 svga: use the winsys interface to invalidate surface
Instead of directly sending the InvalidateGBSurface command,
this patch uses the invalidate_surface interface.

Fixes Linux VM piglit failures including
   ext_texture_array-gen-mipmap, fbo-generatemipmap-array S3TC_DXT1

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-26 11:37:59 -06:00
Charmaine Lee
5bd5ec6a0f svga: fix format for screen target
This patch revises the fix in commit 606f13afa31c9f041a68eb22cc32112ce813f944
to properly translate the surface format for screen target.
Instead of changing the svga format for PIPE_FORMAT_B5G6R5_UNORM
to SVGA3D_R5G6B5 for all texture surfaces, this patch only restricts
SVGA3D_R5G6B5 for screen target surfaces. This avoids rendering
failures when specifying a non-vgpu10 format in a vgpu10 context with
software renderer.

Fixes piglit failures spec@!opengl 1.1@draw-pixels,
                      spec@!opengl 1.1@teximage-colors gl_r3_g3_b2
                      spec@!opengl 1.1@texwrap formats

Tested Xorg with 16bits depth.
Also tested with MTT piglit, MTT glretrace.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-26 11:37:59 -06:00
Charmaine Lee
3626112214 svga: cache the backing surface handle in the texture object
CinebenchR15 not only binds the same texture for rendering and sampling,
it actually changes the framebuffer buffer attachment very often, causing
a lot of backed surface views to be created and a lot of surface copies
to be done. This patch caches the backed surface handle
in the texture resource and allows the backed surface view to
reuse the backed surface handle.  With this patch, the number of
backed surface views drops from 1312 to 3. Unfortunately, this
does not eliminate all the surface copies. There are still surface
copies involved when we switch from original to backed surface handle
for rendering.

Tested with CinebenchR15, NobelClinicianViewer, Turbine, Lightsmark2008,
            MTT glretrace, MTT piglit.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-26 11:37:59 -06:00
Charmaine Lee
7f2f695d4d svga: Update the backing resource only if needed
This patch adds a timestamp in svga_surface structure to keep track
of when the backing surface was last synced with the original resource.
This helps to avoid unnecessary surface copies from the original
resource to the backing surface if the original resource has not
been modified since.
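
The timestamp idea can be sketched as below. This is an illustration under
assumed names: struct resource, struct backing_surface and both functions are
hypothetical and do not reflect the actual svga structures.

```c
#include <stdbool.h>

/* Hypothetical types: the resource counts its modifications, and the
 * backing surface remembers the count it was last copied at. */
struct resource {
   unsigned age;          /* bumped on every modification */
};

struct backing_surface {
   unsigned synced_age;   /* resource age at the last copy */
};

static void resource_modified(struct resource *res)
{
   res->age++;
}

/* Returns true if a copy was needed (and is now recorded as done). */
static bool sync_backing_surface(struct backing_surface *bs,
                                 const struct resource *res)
{
   if (bs->synced_age == res->age)
      return false;       /* already up to date: skip the copy */
   /* ... the actual surface copy would happen here ... */
   bs->synced_age = res->age;
   return true;
}
```

Repeated calls without an intervening modification skip the copy, which is
the saving the commit is after.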

This reduces the number of surface copies with CinebenchR15.

Tested with CinebenchR15, mtt glretrace.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-26 11:37:59 -06:00
Charmaine Lee
c6576461f5 svga: Set the surface dirty bit for the right surface view
For VGPU10, we will render to a backed surface view when
the same resource is used for rendering and sampling.
In this case, we will mark the dirty bit for the backed surface view.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-26 11:37:59 -06:00
Charmaine Lee
dc30ac5c24 svga: Move rendertarget view related fields to hw_clear state
This patch moves the rendertarget view related fields from
svga_hw_draw_state to svga_hw_clear_state where all the hw
framebuffer related state resides.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-26 11:37:59 -06:00
Charmaine Lee
f482493dcf svga: Move setting the rendered_to flags to framebuffer emit time
Instead of setting the rendered_to flags at set time, this patch
moves the setting of the flags to framebuffer emit time.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-26 11:37:59 -06:00
Brian Paul
1ee181b354 svga: add const qualifiers on svga_check_sampler_view_resource_collision()
We don't change any of the argument objects.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-26 11:37:59 -06:00
Brian Paul
0f236ea785 svga: improve surface view debug messages
The old ones were somewhat cryptic.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-26 11:37:59 -06:00
Brian Paul
943f4f47e0 svga: add DEBUG_SAMPLERS
The debug output in svga_create_sampler_state() was controlled by
DEBUG_VIEWS but that's not consistent with the other debug output for
sampler views.  Create/use a new debug flag just for this.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-26 11:37:59 -06:00
Brian Paul
577e114e46 svga: fail screen creation if HW version is too old
Tested by verifying 3D acceleration works with HWv8 but not earlier.
For HWv7 and older we get the GDI Generic renderer.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-26 11:37:59 -06:00
Deepak Rawat
8de0452ec4 winsys/svga: fix error path when kernel is not able to create surface
If for some reason kernel is not able to create surface,
when no buffer was provided the function
vmw_svga_winsys_surface_create should return NULL.

This patch fixes the issue where the code was not following the
clean up path in case of error, which used to cause SIGSEGV.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2017-04-26 11:37:59 -06:00
Brian Paul
75be43ed33 draw: whitespace fixes in draw_pipe_vbuf.c
Remove trailing whitespace, fix formatting, etc.  Trivial.
2017-04-26 11:37:59 -06:00
Brian Paul
4bb19a1514 st/mesa: minor clean-ups in st_update_renderbuffer_surface()
Remove unneeded parens.  Add const qualifiers.  Move var decls closer
to where they're used.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-04-26 11:37:59 -06:00
Samuel Pitoiset
00b5044740 nv50,nvc0: disable the TGSI merge registers pass
shader-db results on GK106 (Thanks Karol):

total instructions in shared programs : 3931608 -> 3929463 (-0.05%)
total gprs used in shared programs    : 481255 -> 479014 (-0.47%)
total local used in shared programs   : 27481 -> 27381 (-0.36%)
total bytes used in shared programs   : 36031256 -> 36011120 (-0.06%)

                local        gpr       inst      bytes
    helped          14        1471        1309        1309
      hurt           1          88         384         384

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-26 19:15:54 +02:00
Samuel Pitoiset
0bceefc295 radeonsi: disable the TGSI merge registers pass
47109 shaders in 29632 tests
Totals:
SGPRS: 1917364 -> 1916620 (-0.04 %)
VGPRS: 1165802 -> 1165202 (-0.05 %)
Spilled SGPRs: 1880 -> 1843 (-1.97 %)
Spilled VGPRs: 70 -> 65 (-7.14 %)
Private memory VGPRs: 1184 -> 1184 (0.00 %)
Scratch size: 1312 -> 1308 (-0.30 %) dwords per thread
Code Size: 60211356 -> 60192268 (-0.03 %) bytes
LDS: 1077 -> 1077 (0.00 %) blocks
Max Waves: 428597 -> 428674 (0.02 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 238173 -> 237429 (-0.31 %)
VGPRS: 149556 -> 148956 (-0.40 %)
Spilled SGPRs: 1263 -> 1226 (-2.93 %)
Spilled VGPRs: 25 -> 20 (-20.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 20 -> 16 (-20.00 %) dwords per thread
Code Size: 10457904 -> 10438816 (-0.18 %) bytes
LDS: 50 -> 50 (0.00 %) blocks
Max Waves: 41283 -> 41360 (0.19 %)
Wait states: 0 -> 0 (0.00 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-26 19:15:40 +02:00
Samuel Pitoiset
066a572955 st/glsl_to_tgsi: disable the merge registers pass conditionally
The main goal of this pass is to merge temporary registers in order
to reduce the total number of registers and also to produce
optimal TGSI code.

In fact, compilers seem to be confused when temporary variables
are already merged, maybe because it's done too early in the
process.

Skipping the pass reduces both the register pressure and the code
size, at least for Nouveau and RadeonSI because they have a real
backend compiler.

Found by luck while fixing an issue in the TGSI dead code elimination
pass which affects tex instructions with bindless samplers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-26 19:15:37 +02:00
Samuel Pitoiset
3a927e0aa3 gallium: add PIPE_SHADER_CAP_TGSI_SKIP_MERGE_REGISTERS
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-26 19:15:34 +02:00
Samuel Pitoiset
ec301497b8 radeonsi: use unsynchronized transfers for shader binary uploads
Because the buffer is new, it can't be referenced by any CS.

This can save a few CPU cycles by skipping the whole
PIPE_TRANSFER_UNSYNCHRONIZED if-block in amdgpu_bo_map().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-26 19:15:22 +02:00
Marek Olšák
96b0cfc82e radeonsi: turn si_shader_key::mono into a non-union
A merged LS-HS shader needs both fix_fetch and inputs_to_copy
for compilation.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-26 13:08:05 +02:00
Marek Olšák
3f2a0649ab radeonsi: adjust ESGS ring buffer size computation on VI
Cc: 17.0 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-26 13:08:05 +02:00
Marek Olšák
80814819c2 radeonsi/gfx9: don't set deprecated field PARTIAL_ES_WAVE_ON
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-26 13:08:05 +02:00
Marek Olšák
60a20e6879 radeonsi/gfx9: set MAX_PRIMGRP_IN_WAVE in the correct register
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-26 13:08:05 +02:00
Marek Olšák
8e8570a9e8 radeonsi/gfx9: add a workaround for viewing a slice of 3D as a 2D image
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-26 13:08:05 +02:00
Marek Olšák
482e6b07cc radeonsi/gfx9: fix 1D array shader images
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-26 13:08:05 +02:00
Marek Olšák
5c94779585 radeonsi/gfx9: fix most things wrong with shader images
There are 2 major hw changes:
- The address must always point to the address of level 0. GFX9 tiling
  modes don't allow binding to a non-0 level.
- 3D must always be bound as 3D, because 2D and 3D use entirely different
  tiling modes, and the texture target determines which set of modes is
  used.

Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-26 13:08:05 +02:00
Marek Olšák
65e0c3fba7 radeonsi/gfx9: fix texture buffer objects and image buffers with IDXEN==0
Cc: 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-26 13:08:05 +02:00
Eric Engestrom
9d1dbf2aa1 configure: print LDFLAGS alongside CFLAGS & co.
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-26 10:27:17 +01:00
Timothy Arceri
2895d96a05 mesa: tidy up left over APPLE_vertex_array_object semantics
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-26 10:03:06 +10:00
Timothy Arceri
f38845b9cb mesa: inline bind_vertex_array() helper
The previous commit removed the only other user of this function.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-26 10:03:06 +10:00
Timothy Arceri
7927d0378f mesa: drop APPLE_vertex_array_object support
Shared context support for VAOs was dropped in 0b2750620b.

From the ARB_vertex_array_object spec:

   "This extension differs from GL_APPLE_vertex_array_object
   in that client memory cannot be accessed through a
   non-zero vertex array object.  It also differs in that
   vertex array objects are explicitly not sharable between
   contexts."

Nobody should be using this extension over
ARB_vertex_array_object anymore so just drop it rather than
adding locking back just for VAOs created from these
functions.

For reference the Nvidia blob doesn't expose this extension.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-26 10:03:06 +10:00
Bas Nieuwenhuizen
7b9963a28f radv: Enable userspace fence checking.
v2: - Added some error handling.
    - memset the buffer to 0.

v3: Added assert for buffer size.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-26 01:32:41 +02:00
Matt Turner
ee5f96581a i965: Remove unused variable 'options'
Should have been removed in commit ad55b1a770
2017-04-25 15:28:33 -07:00
Matt Turner
71d11f3998 glsl: Initialize current_var
CID: 1324644 (Uninitialized pointer field)
2017-04-25 15:28:33 -07:00
Dave Airlie
7f77554b5b radv/ac: setup mrt exports then export them in one go. (v2)
Noticed while looking at Sascha Willems deferred shaders.

This is a bit of an llvm workaround, llvm was producing this:
        v_cvt_pkrtz_f16_f32_e64 v4, v7, v8                       ; D2960004 00021107
        v_cvt_pkrtz_f16_f32_e64 v6, v9, 1.0                      ; D2960006 0001E509
        s_waitcnt vmcnt(0)                                       ; BF8C0F70
        exp mrt0 v4, v4, v6, v6 compr                            ; C400040F 00000604
        s_waitcnt expcnt(0)                                      ; BF8C0F0F
        v_cvt_pkrtz_f16_f32_e64 v4, v12, v5                      ; D2960004 00020B0C
        v_cvt_pkrtz_f16_f32_e64 v5, v14, 1.0                     ; D2960005 0001E50E
        exp mrt1 v4, v4, v5, v5 compr                            ; C400041F 00000504
        s_waitcnt expcnt(0)                                      ; BF8C0F0F
        v_cvt_pkrtz_f16_f32_e64 v0, v0, v1                       ; D2960000 00020300
        v_cvt_pkrtz_f16_f32_e64 v1, v2, v3                       ; D2960001 00020702
        exp mrt2 v0, v0, v1, v1 done compr vm                    ; C4001C2F 00000100

After this change:
        v_cvt_pkrtz_f16_f32_e64 v4, v7, v8                       ; D2960004 00021107
        s_waitcnt vmcnt(0)                                       ; BF8C0F70
        v_cvt_pkrtz_f16_f32_e64 v0, v0, v1                       ; D2960000 00020300
        v_cvt_pkrtz_f16_f32_e64 v6, v9, 1.0                      ; D2960006 0001E509
        v_cvt_pkrtz_f16_f32_e64 v5, v12, v5                      ; D2960005 00020B0C
        v_cvt_pkrtz_f16_f32_e64 v7, v14, 1.0                     ; D2960007 0001E50E
        exp mrt0 v4, v4, v6, v6 compr                            ; C400040F 00000604
        v_cvt_pkrtz_f16_f32_e64 v1, v2, v3                       ; D2960001 00020702
        exp mrt1 v5, v5, v7, v7 compr                            ; C400041F 00000705
        exp mrt2 v0, v0, v1, v1 done compr vm                    ; C4001C2F 00000100

No waitcnt for exports are emitted.

v2: fixup index->mrt mapping (Bas).

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-25 23:26:11 +01:00
Dave Airlie
b2cedb3ea9 radv/ac: overhaul vs output/ps input routing
In order to cleanly eliminate exports, first rewrite the
code to mirror how radeonsi works for now.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-25 23:24:39 +01:00
Dave Airlie
b858cb4df8 radv/ac: move point coord after layer/viewport.
These need to be ordered as per the shader enum ordering. I'll
rewrite this soon, but this is a bug fix.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-25 23:24:21 +01:00
Samuel Pitoiset
1c66522ecc gallium: remove u_caps.c/h interface
No longer used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-25 23:26:44 +02:00
Marek Olšák
04d7978b8c ddebug: implement get_query_result_resource
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-25 22:39:31 +02:00
Marek Olšák
231dfa5a02 trace: don't trace resource_destroy
due to the lack of pipe_resource wrapping, we can get this call from inside
of driver calls, which would try to lock an already-locked mutex.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-25 22:39:31 +02:00
Marek Olšák
2c1ec23a06 gallium/util: add debugging helpers printing pipeline statistics
typically useful for hw bring-up

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-25 22:39:31 +02:00
Rob Herring
26a36c1af7 Android: fix r300g only build
If r300g is the only radeon driver built, the Android build fails
with:

ninja: error:
'out/target/product/linaro_x86_64/obj/STATIC_LIBRARIES/libmesa_pipe_radeon_intermediates/export_includes',
needed by
'out/target/product/linaro_x86_64/obj/SHARED_LIBRARIES/gallium_dri_intermediates/import_includes',
missing and no known rule to make it

This is because the path to build libmesa_pipe_radeon was only getting
added for r600g and radeonsi, but the library dependency was added for
all radeon drivers. As libmesa_pipe_radeon is not needed for r300g, drop
the library dependency.

Cc: Mauro Rossi <issor.oruam@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-25 17:08:06 +01:00
Timothy Arceri
347fe24f82 mesa: use locked version of HashWalk for xfb objects
From Chapter 5 'Shared Objects and Multiple Contexts' of
the OpenGL 4.5 spec:

   "Objects which contain references to other objects include
   framebuffer, program pipeline, query, transform feedback,
   and vertex array objects.   Such objects are called container
   objects and are not shared"

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-25 09:58:47 +10:00
Timothy Arceri
a82d6a307d mesa: create locked version of HashWalk
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-25 09:58:39 +10:00
Rafael Antognolli
6a40ccec4b genxml: Fix gen_pack_header.py crash when field type is invalid.
Just return earlier in that case. Also set prefix to an empty string, so
we don't get to use it undefined.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-24 15:14:12 -07:00
Rafael Antognolli
9670124e31 genxml: Make BLEND_STATE command support variable length array.
We need to emit BLEND_STATE, whose size is 1 + 2 * nr_draw_buffers
dwords (on gen8+), but the BLEND_STATE struct length is always 17. By
marking it size 1, which is actually the size of the struct minus the
BLEND_STATE_ENTRY's, we can emit a BLEND_STATE of variable number of
entries.

For gen6 and gen7 we set length to 0, since it only contains
BLEND_STATE_ENTRY's, and no other data.

With this change, we also change the code for blorp and anv to emit only
the needed BLEND_STATE_ENTRY's, instead of always emitting 16 dwords on
gen6-7 and 17 dwords on gen8+.
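
The dword arithmetic described above can be sketched as follows. This is
only an illustration in Python, not the genxml or driver code: the names
blend_state_dwords and DWORDS_PER_ENTRY are hypothetical, and the figure of
2 dwords per BLEND_STATE_ENTRY is taken from the 1 + 2 * nr_draw_buffers
formula in this message.

```python
# Illustrative only: these names are hypothetical, not part of genxml.
DWORDS_PER_ENTRY = 2  # per BLEND_STATE_ENTRY, from the formula in the message


def blend_state_dwords(gen, nr_draw_buffers):
    """Return how many dwords a BLEND_STATE with this many entries needs."""
    # gen8+ has one dword ahead of the entries; gen6 and gen7 have none.
    header = 1 if gen >= 8 else 0
    return header + DWORDS_PER_ENTRY * nr_draw_buffers


if __name__ == "__main__":
    for gen in (6, 7, 8):
        print(gen, [blend_state_dwords(gen, n) for n in (1, 4, 8)])
```

With 8 entries this reproduces the old fixed sizes (16 dwords on gen6-7,
17 on gen8+); with fewer draw buffers the emitted state shrinks accordingly.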

v2:
   - Use designated initializers on blorp and remove 0 from
   initialization (Jason)
   - Default entries to disabled on Vulkan (Jason)
   - Rebase code.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-24 15:14:10 -07:00
Rafael Antognolli
4ace73b1f6 genxml: Fix python crash when no dwords are found.
If the 'dwords' dict is empty, max(dwords.keys()) throws an exception.
This case could happen when we have an instruction that is only an array
of other structs, with variable length.

v2:
   - Add another clause for empty dwords and make it work with python 3
   (Dylan)
   - Set the length to 0 if dwords is empty, and do not declare dw
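
A minimal sketch of the failure mode and the guard, assuming a dict keyed
by 0-based dword index. The helper name struct_length is hypothetical and
this is not the actual gen_pack_header.py code:

```python
# Hypothetical stand-in for the dict the generator builds; keys are
# assumed to be 0-based dword indices.
def struct_length(dwords):
    """Return the number of dwords covered, or 0 when the dict is empty."""
    # max() on an empty sequence raises ValueError, so handle that first.
    if not dwords:
        return 0
    # Largest index seen plus one gives the count for 0-based indices.
    return max(dwords.keys()) + 1


if __name__ == "__main__":
    print(struct_length({0: "header", 2: "payload"}))
    print(struct_length({}))
```

The same guard works on both Python 2 and Python 3, since an empty dict
is falsy in either.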

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-24 15:14:08 -07:00
Rafael Antognolli
19720405d5 genxml: Remove unused parameter.
'start' parameter from Group.emit_pack_function() is useless.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-24 15:14:05 -07:00
Rafael Antognolli
1ea41163eb intel/aubinator: Correctly read variable length structs.
Before this commit, when a group with count="0" is found, only one field
is added to the struct representing the instruction. This causes only
one entry to be printed by aubinator, for variable length groups.

With this commit we "detect" that there's a variable length group
(count="0") and store the offset of the last entry added to the struct
when reading the xml. When finally reading the aubdump file, we check
the size of the group and whether we have variable number of elements,
and in that case, reuse the last field to add the remaining elements.
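
The idea of reusing the last field for the remaining elements can be
sketched like this. It is a simplified Python model, not the aubinator
decoder: the function decode, its parameters and the flat list of dwords
are all assumptions made for illustration.

```python
# Hypothetical model: a struct is a list of fixed field names followed by
# one variable-length group whose entries are `entry_dwords` dwords each.
def decode(fixed_fields, group_name, entry_dwords, data):
    """Pair names with dwords, repeating the group entry until data runs out.

    Assumes len(data) >= len(fixed_fields).
    """
    out = [(name, data[i]) for i, name in enumerate(fixed_fields)]
    offset = len(fixed_fields)
    index = 0
    # Reuse the single group description for every remaining entry.
    while offset + entry_dwords <= len(data):
        entry = tuple(data[offset:offset + entry_dwords])
        out.append((f"{group_name}[{index}]", entry))
        offset += entry_dwords
        index += 1
    return out


if __name__ == "__main__":
    for name, value in decode(["header"], "entry", 2, [10, 1, 2, 3, 4]):
        print(name, value)
```

Without the loop only the first entry would be reported, which matches the
behaviour described before this commit.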

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-24 15:13:51 -07:00
Nanley Chery
50134cede1 isl/format: Update the R16G16B16X16_FLOAT entry
The section of the PRM mentioned in the code comment above this table
says that this format supports the render target write message. Internal
documentation says that this format also supports alpha blending. As a
side effect, this allows CCS_D buffers to be created for images with
this format.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2017-04-24 13:30:50 -07:00
Nanley Chery
b1066f7365 anv/pass: Delete anv_pass::subpass_attachments
This field has no users.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2017-04-24 13:30:50 -07:00
Francisco Jerez
58324389be intel/fs: Take into account amount of data read in spilling cost heuristic.
Until now the spilling cost calculation was neglecting the amount of
data read from the register.
This caused it to make suboptimal decisions in some cases leading to
higher memory bandwidth usage than necessary.

Improves Unigine Heaven performance by ~4% on BDW, reversing an
unintended FPS regression from my previous commit
147e71242c with n=12 and statistical
significance 5%.  In addition SynMark2 OglCSDof performance is
improved by an additional ~5% on SKL, and a Kerbal Space Program
apitrace around the Moho planet I can provide on request improves by
~20%.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-24 11:01:40 -07:00
Francisco Jerez
ecc19e12dc intel/fs: Use regs_written() in spilling cost heuristic for improved accuracy.
This is what we use later on to compute the number of registers that
will actually get spilled to memory, so it's more likely to match
reality than the current open-coded approximation.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-24 10:59:56 -07:00
Kenneth Graunke
6b10c37b9c i965/vec4: Use reads_accumulator_implicitly(), not MACH checks.
Curro pointed out that I should not just check for MACH, but use
the reads_accumulator_implicitly() helper, which would also prevent
the same bug with MAC and SADA2 (if we ever decide to use them).

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-24 10:53:49 -07:00
Mauro Rossi
11db3d10bb android: radv/ac: Fix nir.h include
Fixes the following build errors due to missing include paths:

external/mesa/src/amd/common/ac_shader_info.c:23:10: fatal error: 'nir/nir.h' file not found
         ^

external/mesa/src/compiler/nir/nir.h:48:10: fatal error: 'nir_opcodes.h' file not found
         ^

Fixes: 224cf29 "radv/ac: add initial pre-pass for shader info gathering"
Acked-by: Dave Airlie <Airlied@redhat.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-24 18:01:03 +01:00
Vinson Lee
b81d85f175 configure.ac: Fix typos.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: <mesa-stable@lists.freedesktop.org>
2017-04-23 22:23:22 -07:00
Dave Airlie
fed740eafe radv/ac: copy llvm machine feature flags from radeonsi.
This just updates this to use the same flags as radeonsi
for consistency.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-24 05:55:44 +01:00
Timothy Arceri
794ae44095 i965: remove now unused GLSL IR optimisations
These are no longer used since the previous commit.

Acked-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-24 12:08:14 +10:00
Timothy Arceri
ad55b1a770 i965: remove GLSL IR optimisation loop
IVB is running into some spilling issues in piglit with the
loop removed. However those tests are not really reflective
of a real world use case, also fp64 is brand new to IVB
so we leave the spilling issues to be resolved at a later
time.

Run time for shader-db on my machine goes from ~795 seconds to
~665 seconds.

shader-db results BDW:

total instructions in shared programs: 12969459 -> 12968891 (-0.00%)
instructions in affected programs: 1463154 -> 1462586 (-0.04%)
helped: 3622
HURT: 3326

total cycles in shared programs: 246453572 -> 246504318 (0.02%)
cycles in affected programs: 208842622 -> 208893368 (0.02%)
helped: 24029
HURT: 35407

total loops in shared programs: 2931 -> 2931 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 14560 -> 14498 (-0.43%)
spills in affected programs: 2270 -> 2208 (-2.73%)
helped: 17
HURT: 2

total fills in shared programs: 19671 -> 19632 (-0.20%)
fills in affected programs: 2060 -> 2021 (-1.89%)
helped: 17
HURT: 2

LOST:   17
GAINED: 40

Most of the hurt shaders are 1-2 instructions, with what looks like a max of 7.

I've looked at the worst cycles regressions and as far as I can tell it's just
a scheduling difference.

Acked-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-24 12:08:14 +10:00
Timothy Arceri
21173194db glsl: use ARB_enhanced_layouts for packing where possible
If packing doesn't cross locations we can easily make use of
ARB_enhanced_layouts to do packing rather than using the GLSL IR
lowering pass lower_packed_varyings().

Shader-db Broadwell results:

total instructions in shared programs: 12977822 -> 12977819 (-0.00%)
instructions in affected programs: 1871 -> 1868 (-0.16%)
helped: 4
HURT: 3

total cycles in shared programs: 246567288 -> 246567668 (0.00%)
cycles in affected programs: 1370386 -> 1370766 (0.03%)
helped: 592
HURT: 733

Acked-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-24 12:08:14 +10:00
Timothy Arceri
eb8aa93c03 glsl: disable varying packing for varying used by interpolateAt*
Currently the NIR backends depend on GLSL IR copy propagation to
fix up the interpolateAt* function params after varying packing
changes the shader input to a global. It's possible copy propagation
might not always do what we need it to, and we also shouldn't
depend on optimisations to do this type of thing for us.

I'm not sure if the same is true for TGSI, but the following
commit should re-enable packing for most cases in a safer way,
so we just disable it everywhere.

No change in shader-db for i965 (BDW)

Acked-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-24 12:08:14 +10:00
Timothy Arceri
aa021d50c0 glsl_to_nir: skip ir_var_shader_shared variables
These should be lowered away in GLSL IR but if we don't get dead
code to clean them up it causes issues in glsl_to_nir.

We want to drop as many GLSL IR opts in future as we can so this
makes glsl_to_nir just ignore the vars if it sees them.

In future we will want to just use the nir lowering pass that
Vulkan currently uses.

Acked-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-24 12:08:14 +10:00
Timothy Arceri
7a7ee40c2d nir/i965: add before ffma algebraic opts
This shuffles constants down in the reverse of what the previous
patch does and applies some simplifications that may be made
possible from doing so.

Shader-db results BDW:

total instructions in shared programs: 12980814 -> 12977822 (-0.02%)
instructions in affected programs: 281889 -> 278897 (-1.06%)
helped: 1231
HURT: 128

total cycles in shared programs: 246562852 -> 246567288 (0.00%)
cycles in affected programs: 11271524 -> 11275960 (0.04%)
helped: 1630
HURT: 1378

V2: mark float opts as inexact

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-24 12:08:14 +10:00
Timothy Arceri
fb2269fed1 nir: shuffle constants to the top
V2: mark float opts as inexact

If one of the inputs to an mul/add is the result of another
mul/add there is a chance that we can reuse the result of that
mul/add in other calls if we do the multiplication in the right
order.

Also by attempting to move all constants to the top we increase
the chance of constant folding.

For example it is a fairly common pattern for shaders to do something
similar to this:

  const float a = 0.5;
  in vec4 b;
  in float c;

  ...

  b.x = b.x * c;
  b.y = b.y * c;

  ...

  b.x = b.x * a + a;
  b.y = b.y * a + a;

So by simply detecting that constant a is part of the multiplication
in ffma and switching it with previous fmul that updates b we end up
with:

  ...

  c = a * c;

  ...

  b.x = b.x * c + a;
  b.y = b.y * c + a;
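
The rewrite above can be checked numerically with a small Python sketch.
The function names before and after are hypothetical and only model one
component of b; they are not NIR code. The results are compared with
math.isclose because reordering float multiplications is not guaranteed to
be bit-exact, which is why the opts are marked inexact.

```python
import math


def before(bx, c, a):
    """Original order: b.x = b.x * c, then b.x = b.x * a + a."""
    bx = bx * c
    return bx * a + a


def after(bx, c, a):
    """Shuffled order: c = a * c once, then b.x = b.x * c + a."""
    c = a * c
    return bx * c + a


if __name__ == "__main__":
    a = 0.5
    for bx, c in ((3.0, 7.0), (0.1, 0.3), (-2.5, 4.0)):
        print(before(bx, c, a), after(bx, c, a),
              math.isclose(before(bx, c, a), after(bx, c, a)))
```

In the shuffled form a * c is computed once and shared by every component
of b, so the per-component multiply by c disappears.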

Shader-db results BDW:

total instructions in shared programs: 13011050 -> 12967888 (-0.33%)
instructions in affected programs: 4118366 -> 4075204 (-1.05%)
helped: 17739
HURT: 1343

total cycles in shared programs: 246717952 -> 246410716 (-0.12%)
cycles in affected programs: 166870802 -> 166563566 (-0.18%)
helped: 18493
HURT: 7965

total spills in shared programs: 14937 -> 14560 (-2.52%)
spills in affected programs: 9331 -> 8954 (-4.04%)
helped: 284
HURT: 33

total fills in shared programs: 20211 -> 19671 (-2.67%)
fills in affected programs: 12586 -> 12046 (-4.29%)
helped: 286
HURT: 33

LOST:   39
GAINED: 33

Some of the hurt will go away when we shuffle things back down to the
bottom in the following patch. It's also noteworthy that almost all of the
spill changes are in Deus Ex both hurt and helped.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-24 12:08:14 +10:00
Timothy Arceri
83f7fdf83a nir: add flt comparison simplification
Didn't turn out as useful as I'd hoped, but it will help a lot more on
i965 by reducing regressions when we drop brw_do_channel_expressions()
and brw_do_vector_splitting().

I'm not sure how much sense 'is_not_used_by_conditional' makes on
platforms other than i965 but since this is a new opt it at least
won't do any harm.

shader-db BDW:

total instructions in shared programs: 13029581 -> 13029415 (-0.00%)
instructions in affected programs: 15268 -> 15102 (-1.09%)
helped: 86
HURT: 0

total cycles in shared programs: 247038346 -> 247036198 (-0.00%)
cycles in affected programs: 692634 -> 690486 (-0.31%)
helped: 183
HURT: 27

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-24 12:08:14 +10:00
Bas Nieuwenhuizen
18947fde7a radv: Enable lowering fdiv in nir.
Results in faster code than the lowering by LLVM.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-23 20:38:06 +02:00
Rob Clark
0012a98c0e freedreno/a5xx: hack for r8g8b8a8_snorm
Blob won't render to this format, and when sampling from it, it uses the
same fmt value for r8g8b8_snorm and r8g8b8a8_snorm.  But this is what
blocks us from jumping from gl30/gles20 to gl31/gles30.  So a hack it
is!

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-23 13:03:25 -04:00
Rob Clark
c21fc881ed freedreno/a5xx: rgtc formats
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-23 13:03:25 -04:00
Marek Olšák
070072ad43 mesa: replace _mesa_index_buffer::type with index_size
This avoids repeated translations of the enum.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-22 22:51:15 +02:00
Bas Nieuwenhuizen
e137b9eed9 radv: Use the correct pipeline for dispatches.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: ec15e0d30 "radv: optimise compute shader grid size emission."
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-22 20:26:59 +01:00
Wladimir J. van der Laan
9da0cd56c3 etnaviv: Supertiled texture support on gc3000
Support supertiled textures on hardware that has the appropriate
feature flag SUPERTILED_TEXTURE.

Most of the scaffolding was already in place in etna_layout_multiple:

   case ETNA_LAYOUT_SUPER_TILED:
      *paddingX = 64;
      *paddingY = 64;
      *halign = TEXTURE_HALIGN_SUPER_TILED;

So this is just a matter of allowing it.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-04-22 17:49:29 +02:00
Fabio Estevam
53e39f6df4 etnaviv: etnaviv_fence: Simplify the return code logic
The return code can be simplified by using the logical not operator.

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-04-22 17:48:35 +02:00
Rob Clark
e769349fc6 freedreno/a5xx: occlusion query
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-22 10:03:02 -04:00
Rob Clark
52d2fa37f5 freedreno: drop ring arg from _set_stage()
It is always the draw ring.  Except for a5xx queries like time-elapsed,
where we will eventually want to emit cmds into both binning and draw
rings.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-22 10:03:02 -04:00
Rob Clark
5923780b2a freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-22 10:03:02 -04:00
Rob Clark
d310ea0f32 freedreno: add support for hw accumulating queries
Some queries on a4xx and all queries on a5xx can do result accumulation
on CP so we don't need to track per-tile samples.  We do still need to
handle pausing/resuming while switching batches (in case the query is
active over multiple draws which are executed out of order).

So introduce new accumulated-query helpers for these sorts of queries,
since it doesn't really fit in cleanly with the original query infra-
structure.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-22 10:03:02 -04:00
Rob Clark
935623af14 freedreno: a bit of query refactor
Move a bit more of the logic shared by all query types (active tracking,
etc) into common code.  This avoids introducing a 3rd copy of that logic
for a5xx.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-22 10:03:02 -04:00
Rob Clark
df63ff4d82 freedreno: make hw-query a helper
For a5xx (and actually some queries on a4xx) we can accumulate results
in the cmdstream, so we don't need this elaborate mechanism of tracking
per-tile query results.  So make it into vfuncs so generation specific
backend can use it when it makes sense.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-22 10:03:01 -04:00
Kenneth Graunke
2faf227ec2 i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().
opt_register_coalesce() was optimizing sequences such as:

   mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
   mach(8) vgrf5.xy:D, attr18.xyyy:D, attr19.xyyy:D
   mov(8) m4.zw:F, vgrf5.xxxy:F

into:

   mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
   mach(8) m4.zw:D, attr18.xxxy:D, attr19.xxxy:D

This doesn't work - if we're going to reswizzle MACH, we'd need to
reswizzle the MUL as well.  Here, the MUL fills the accumulator's .zw
components with attr18.yy * attr19.yy.  But the MACH instruction expects
.z to contain attr18.x * attr19.x.  Bogus results ensue.
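
The mismatch can be reproduced with a simplified Python model. It only
tracks which product each channel holds. It does not model what MACH
really computes from the accumulator, and the helper names and input
values are made up for illustration.

```python
CHANNELS = "xyzw"


def swizzle(vec, pattern):
    """Read a 4-component vector through a swizzle such as 'xyyy'."""
    return [vec[CHANNELS.index(ch)] for ch in pattern]


def products(a, b, pattern):
    """Per-channel product of two vectors read through the same swizzle."""
    return [p * q for p, q in zip(swizzle(a, pattern), swizzle(b, pattern))]


attr18 = [2, 3, 0, 0]
attr19 = [5, 7, 0, 0]

acc = products(attr18, attr19, "xyyy")     # what the MUL left in acc0
wanted = products(attr18, attr19, "xxxy")  # what the reswizzled MACH assumes

# The reswizzled MACH only writes .z and .w, so only those channels matter.
mismatch = [ch for ch in "zw"
            if acc[CHANNELS.index(ch)] != wanted[CHANNELS.index(ch)]]
print(acc, wanted, mismatch)
```

In this model .w happens to agree while .z does not, which is the channel
called out above.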

No change in shader-db on Haswell.  Prevents regressions in Timothy's
patches to use enhanced layouts for varying packing (which rearrange
code just enough to trigger this pre-existing bug, but were fine
themselves).

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-22 00:01:16 -07:00
Timothy Arceri
d682f8aa8e mesa: validate sampler type across the whole program
Previously we were only making sure types were the same within a
single stage. This looks to have regressed with 953a0af8e3.

Fixes: 953a0af8e3 ("mesa: validate sampler uniforms during gluniform calls")

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
https://bugs.freedesktop.org/show_bug.cgi?id=97524
2017-04-22 10:01:15 +10:00
Timothy Arceri
918cec8cbe mesa: don't lock hashtables that are not shared across contexts
From Chapter 5 'Shared Objects and Multiple Contexts' of
the OpenGL 4.5 spec:

   "Objects which contain references to other objects include
   framebuffer, program pipeline, query, transform feedback,
   and vertex array objects.   Such objects are called container
   objects and are not shared"

For now we leave locking in place for framebuffer objects because
the EXT fbo extension allowed sharing.

We could maybe just replace the hash with an ordinary hash table
but for now this should remove most of the unnecessary locking.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-22 10:01:15 +10:00
Matt Turner
ef6af0d5f7 mesa: Remove deleteFlag pattern from container objects.
This pattern was only useful when we used mutex locks, which the previous
commit removed.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-22 10:01:15 +10:00
Matt Turner
0b2750620b mesa: Remove unnecessary locking from container objects.
From Chapter 5 'Shared Objects and Multiple Contexts' of
the OpenGL 4.5 spec:

   "Objects which contain references to other objects include
   framebuffer, program pipeline, query, transform feedback,
   and vertex array objects.   Such objects are called container
   objects and are not shared"

For now we leave locking in place for framebuffer objects because
the EXT fbo extension allowed sharing.

V2: (Timothy Arceri)
 - rebased and dropped changes to framebuffer objects

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-22 10:01:15 +10:00
Timothy Arceri
622a68ed3e mesa: remove fallback RefCount == 0 pattern
We should never get here if this is 0 unless there is a
bug. Replace the check with an assert.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-22 10:01:15 +10:00
Elie TOURNIER
0cc8c81902 egl: add gitignore
Since commit ce562f9e3f, two new files are generated.
We don't want to track them.

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-22 00:42:38 +01:00
Samuel Pitoiset
a7bc51aef8 glsl: make use of glsl_type::is_float()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-21 19:34:15 +02:00
Samuel Pitoiset
cacc823c39 glsl: make use of glsl_type::is_double()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-21 19:34:12 +02:00
Samuel Pitoiset
100721959b glsl: make use of glsl_type::is_integer_64()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-21 19:33:57 +02:00
Samuel Pitoiset
362d9de29c glsl: simplify glsl_type::is_integer_32_64()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-21 19:33:42 +02:00
Samuel Pitoiset
87be9faa78 glsl: add glsl_type::is_integer_64()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-21 19:33:40 +02:00
Samuel Pitoiset
60caca3019 glsl: make use of glsl_type::is_boolean()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-21 19:33:38 +02:00
Samuel Pitoiset
64db02b5fa glsl: make use of glsl_type::is_record()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-21 19:33:36 +02:00
Samuel Pitoiset
cd78ab55d0 glsl: make use of glsl_type::is_interface()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-21 19:33:34 +02:00
Samuel Pitoiset
0c8898dc34 glsl: make use of glsl_type::is_array()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-21 19:33:32 +02:00
Samuel Pitoiset
053912382e glsl: make use of glsl_type::is_atomic_uint()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-21 19:33:29 +02:00
Samuel Pitoiset
993a05f0eb glsl: add glsl_type::is_atomic_uint() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-21 19:33:27 +02:00
Emil Velikov
52df318d61 mesa/glthread: correctly compare thread handles
As mentioned in the manual - comparing pthread_t handles via the C
comparison operator is incorrect and pthread_equal() should be used
instead.

Cc: Timothy Arceri <tarceri@itsqueeze.com>
Fixes: d8d81fbc31 ("mesa: Add infrastructure for a worker thread to process GL commands.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-04-21 13:39:57 +01:00
Emil Velikov
dd6ec78b4f st/clover: add space between < and ::
As pointed out by compiler

./llvm/codegen.hpp:52:22: error: ‘<::’ cannot begin a template-argument list [-fpermissive]
./llvm/codegen.hpp:52:22: note: ‘<:’ is an alternate spelling for ‘[’. Insert whitespace between ‘<’ and ‘::’

Cc: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Vedran Miletić <vedran@miletic.net>
2017-04-21 13:39:57 +01:00
Samuel Pitoiset
862361c4f5 glsl: get rid of values_for_type()
This function is actually a wrapper for component_slots()
and it always returns 1 (or N) for samplers. Since
component_slots() now returns 1 for samplers, it can go.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-21 10:08:32 +02:00
Samuel Pitoiset
4a0aa0b3b3 glsl: make component_slots() return 1 for sampler types
It looks inconsistent to return 1 for image types and 0 for
sampler types. Especially because component_slots() is mostly
used by values_for_type() which always returns 1 for samplers.

For bindless, this value will be bumped to 2 because the
ARB_bindless_texture spec states that bindless samplers/images
should consume two components.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-21 10:08:04 +02:00
Kai Wasserbäch
29582dd20c docs/features: mark KHR_no_error as started
The OpenGL extension KHR_no_error has been exposed since commit
d42d150ad2 by Timothy Arceri. Therefore it
should be marked as "started" in features.txt.

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-21 09:39:38 +02:00
Tapani Pälli
ae6cbdede0 Revert "android: fix segfault within swap_buffers"
This reverts commit 4d4558411d.

This was the wrong call: while it fixed an issue with 3DMark, it
actually introduced a regression elsewhere.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
2017-04-21 10:03:58 +03:00
Ilia Mirkin
da0a80804c nvc0: Add support for setting viewport index/layer from VS/TES
This enables support on GM200+ for:
 - GL_AMD_vertex_shader_layer
 - GL_AMD_vertex_shader_layer_viewport_index
 - GL_ARB_shader_viewport_layer_array

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
[lyude: add relnotes/TES cap]
Signed-off-by: Lyude <lyude@redhat.com>
[imirkin: move relnotes to right place, add features.txt]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-20 23:24:06 -04:00
Lyude
214f96c1e7 nvc0/ir: Only store viewport in scratch register for GP
EMIT only applies to geometry shaders. For everything else, we want to
export the viewport normally.

Signed-off-by: Lyude <lyude@redhat.com>
Reviewed-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-20 23:24:06 -04:00
Bas Nieuwenhuizen
0e91d8f38c radv: Prefetch compute shader too.
For consistency, doesn't really impact performance.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-21 00:59:02 +02:00
Jason Ekstrand
1e21d4227e anv/query: Use genxml for MI_MATH
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-04-20 15:24:06 -07:00
Jason Ekstrand
e23129ac0c genxml: Add better support for MI_MATH
This breaks the guts of MI_MATH (the instruction part) out into its own
structure with proper named values.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-04-20 15:24:06 -07:00
Jason Ekstrand
b7a2af8e38 genxml/pack: Allow hex values in the XML
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2017-04-20 15:24:06 -07:00
Dave Airlie
35ea0c07a1 radv/ac: use tex_lz if we can.
Looking at some Talos shaders vs radeonsi, I noticed they use
tex_lz in a few places, so we should be able to as well.

Reviewed-by: Bas Nieuwenhuizen <basni@google.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-20 22:00:13 +01:00
Marek Olšák
d1608d6982 st/mesa: use one big translation table in st_pipe_vertex_format
for lower overhead.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-20 20:11:35 +02:00
Marek Olšák
86f99c1e4c st/mesa: check in advance in st_draw_vbo whether the bitmap cache is empty
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-20 20:11:35 +02:00
Marek Olšák
1fb5bc83f1 st/mesa: put the bitmap_cache structure inside st_context
This is nicer on caches, and the next commit will need to access
the structure from a different place.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-20 20:11:35 +02:00
Marek Olšák
69423dcf23 st/mesa: inline and optimize st_invalidate_readpix_cache
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-20 20:11:35 +02:00
Marek Olšák
7cd6e2df65 st/mesa: invalidate the readpix cache in st_indirect_draw_vbo
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-20 20:11:35 +02:00
Marek Olšák
4219e09343 gallium/util: remove util_draw_range_elements helper
min/max_index are typically hints for the u_vbuf module, not the driver.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-20 20:11:35 +02:00
Marek Olšák
707d2e8b3e gallium: fold u_trim_pipe_prim call from st/mesa to drivers
Most drivers don't need it and shouldn't need it because it can't be used
in some cases (indirect draws, primitive restart, count from streamout).

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-20 20:11:35 +02:00
Samuel Iglesias Gonsálvez
2beff74314 docs/envvars: sort INTEL_DEBUG envvar options by name
It helps to find the envvar option you are looking for.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-04-20 16:27:31 +02:00
Christoph Haag
a9d27c8a33 ac: fix build after LLVM 5.0 SVN r300718
v2: getWithDereferenceableBytes() already existed, but addAttr() doesn't take that type

Signed-off-by: Christoph Haag <haagch+mesadev@frickel.club>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-and-reviewed-by: Mike Lothian <mike@fireburn.co.uk>
2017-04-20 10:58:19 +02:00
Juan A. Suarez Romero
3af7f8275b bin/get-{extra,fixes}-pick-list.sh: improve output
Show the commit hash and the title in a way that it is easier to copy
and paste in the bin/.cherry-ignore-extra file if we want to ignore
those commits for the future.

v2:
- Use printf instead echo (Eric Engestrom)

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-20 10:28:54 +02:00
Juan A. Suarez Romero
99b41631bb bin/get-{extra,fixes}-pick-list.sh: add support for ignore list
Neither script uses a file listing the commits to ignore. So if we
have handled one of the suggested commits and decided we won't pick it,
the scripts will continue suggesting it.

v2:
- Mark the candidates in bin/get-extra-pick-list.sh (Juan A. Suarez)
- Use bin/.cherry-ignore to store rejected patches (Emil)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-20 10:28:21 +02:00
Brian Paul
8a7e3693c8 mesa: print target string in glBindTexture() error message
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-19 19:57:32 -06:00
Brian Paul
9bfecb03c5 mesa: fix Windows build error related to getuid()
getuid() and geteuid() are not present on Windows.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-19 19:55:29 -06:00
Tim Rowley
dd4488ea6c swr: simd16 vs work
Build VS with alternating output for the current simd16 fe double-pump
of a simd8 shader.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-19 19:01:48 -05:00
Bas Nieuwenhuizen
6bb1ed6bcc radv: Set variant code_size when created from the cache.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-20 01:01:49 +02:00
Bas Nieuwenhuizen
1e1165389c radv: Add shader prefetch.
Gives me approximately a 2% perf increase in both dota2 & talos.

Having descriptors (both sets and vertex buffers) prefetched
didn't help so I didn't include that.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-19 23:47:27 +02:00
Bas Nieuwenhuizen
74d92e547c radv: Remove binding buffer count.
In cases where it is used it is always 1.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Bas Nieuwenhuizen <basni@google.com>
2017-04-19 20:37:57 +02:00
Bas Nieuwenhuizen
f7b14ff4be radv: Don't try to find gaps for non-freeable descriptors.
With this we don't have any operations on a pool with non-freeable
descriptors left that have O(#descriptors) complexity.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Bas Nieuwenhuizen <basni@google.com>
2017-04-19 20:37:57 +02:00
Bas Nieuwenhuizen
126d5adb11 radv: Use host memory pool for non-freeable descriptors.
v2: Handle out of pool memory error.
v3: Actually use VK_ERROR_OUT_OF_POOL_MEMORY_KHR for the error condition.
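
A host memory pool for descriptors that are never freed individually can be a simple bump allocator, so allocation needs no gap search. The following is a minimal sketch under that assumption, not the radv code: `struct host_pool`, `pool_alloc`, `pool_reset` and the `POOL_*` result codes are hypothetical names, and `POOL_OUT_OF_MEMORY` only stands in for the VK_ERROR_OUT_OF_POOL_MEMORY_KHR condition mentioned in v3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes; POOL_OUT_OF_MEMORY stands in for the
 * out-of-pool-memory error a driver would report. */
enum pool_result { POOL_SUCCESS = 0, POOL_OUT_OF_MEMORY = -1 };

/* Hypothetical linear pool: `offset` only ever grows until a reset. */
struct host_pool {
   uint8_t *base;
   size_t size;
   size_t offset;
};

/* O(1) allocation: hand out the next `bytes` bytes, or fail without
 * touching `*out` when the pool is exhausted. */
static enum pool_result
pool_alloc(struct host_pool *pool, size_t bytes, void **out)
{
   assert(pool != NULL && out != NULL);
   if (bytes > pool->size - pool->offset)
      return POOL_OUT_OF_MEMORY;
   *out = pool->base + pool->offset;
   pool->offset += bytes;
   return POOL_SUCCESS;
}

/* Resetting the pool releases everything at once. */
static void
pool_reset(struct host_pool *pool)
{
   assert(pool != NULL);
   pool->offset = 0;
}
```

Because individual frees are not supported, no per-allocation bookkeeping is needed; the trade-off is that memory only comes back on a full reset.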

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Bas Nieuwenhuizen <basni@google.com>
2017-04-19 20:37:57 +02:00
Bas Nieuwenhuizen
39644fa40a radv: Don't allocate dynamic descriptors separately.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Bas Nieuwenhuizen <basni@google.com>
2017-04-19 20:37:57 +02:00
Emil Velikov
51c0c213b7 st/mesa: automake: honour the vdpau header install location
If VDPAU is installed in a non-default location, we'll fail to find
the headers and error at build time.

../../src/gallium/include/state_tracker/vdpau_dmabuf.h:37:25: fatal error: vdpau/vdpau.h: No such file or directory
 #include <vdpau/vdpau.h>
                         ^

Fixes: faba96bc60 ("st/vdpau: add new interop interface")
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-19 12:19:46 +01:00
Emil Velikov
309f4067a7 winsys/sw/dri: don't use GNU void pointer arithmetic
Resolves build issues like the following:

src/gallium/winsys/sw/dri/dri_sw_winsys.c:203:31: error: pointer of type ‘void *’ used in arithmetic [-Werror=pointer-arith]
        data = dri_sw_dt->data + (dri_sw_dt->stride * box->y) + box->x * blsize;
                               ^
src/gallium/winsys/sw/dri/dri_sw_winsys.c:203:62: error: pointer of type ‘void *’ used in arithmetic [-Werror=pointer-arith]
        data = dri_sw_dt->data + (dri_sw_dt->stride * box->y) + box->x * blsize;
                                                              ^
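
The errors above come from doing arithmetic on a `void *`, which GCC accepts as an extension but ISO C does not allow. A minimal sketch of the usual portable form, casting to a byte pointer first, is shown here; `texel_address` and its parameters are hypothetical and only mirror the shape of the expression in the error message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the Mesa function: returns the address of the
 * texel at (x, y) in a linear buffer with `stride` bytes per row and
 * `blsize` bytes per texel. */
static void *
texel_address(void *base, unsigned stride, unsigned x, unsigned y,
              unsigned blsize)
{
   assert(base != NULL);
   /* ISO C only defines pointer arithmetic on pointers to complete object
    * types, so step through the buffer as bytes instead of as void. */
   return (uint8_t *)base + (size_t)stride * y + (size_t)x * blsize;
}
```

The result converts back to `void *` implicitly, so callers do not change.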

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-19 12:19:38 +01:00
Emil Velikov
4516bfbd30 configure.ac: check require_basic_egl only if egl enabled
Fixes: 1ac40173c2 ("configure.ac: simplify EGL requirements for drivers dependent on EGL")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-19 12:19:24 +01:00
Emil Velikov
179e21a720 configure.ac: manually expand PKG_CHECK_VAR
The macro was introduced in pkg-config v0.28, which isn't universally
available, so configure errors out where it is missing.

Reported-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
Fixes: ce562f9e3f ("EGL: Implement the libglvnd interface for EGL (v3)")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-19 12:18:29 +01:00
Timothy Arceri
1787a3163f mesa: add KHR_no_error support to glVertexAttribDivisor()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-19 16:53:25 +10:00
Timothy Arceri
f27f699672 mesa/vbo: add KHR_no_error support to DrawElements*() functions
V2: move MESA_VERBOSE checks back into the common code path.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-19 16:53:25 +10:00
Timothy Arceri
3d08e18731 mesa/vbo: add KHR_no_error support to vbo_exec_DrawArrays*()
V2: add missing FLUSH_CURRENT() to no_error path

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-19 16:53:25 +10:00
Timothy Arceri
4df2931a87 mesa/vbo: move some Draw checks out of validation
These checks do not generate any errors. Move them so we can add
KHR_no_error support and still make sure we do these checks.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-19 16:53:25 +10:00
Timothy Arceri
63a14e9e14 mesa/varray: add KHR_no_error support to *Pointer() functions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-19 16:53:25 +10:00
Timothy Arceri
d86dd5963e mesa/varray: add KHR_no_error support to some callers of validate_array_format()
The only caller we don't update is update_arrays(); we leave that to the
following commit.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-19 16:53:25 +10:00
Timothy Arceri
c495c2398c mesa/varray: rename update_array_format() -> validate_array_format()
We also move _mesa_update_array_format() into the caller.

This gets these functions ready for KHR_no_error support.

V2: Updated function comment as suggested by Brian.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-19 16:53:25 +10:00
Timothy Arceri
9e60742ddc mesa/varray: create get_array_format() helper
This will help us split array validation from array update.

V2: add const to ctx param

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-19 16:53:25 +10:00
Timothy Arceri
d0608c43c5 mesa/varray: split update_array() into validate_array() and update_array()
This will be used for adding KHR_no_error support.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-19 16:53:25 +10:00
Timothy Arceri
bd2662bfa1 mesa: add KHR_no_error support to glUniform*() functions
V2: restore lost comment, add static to validate_uniform(),
    simplify array offset logic.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-19 16:53:25 +10:00
Timothy Arceri
2c9ac0bc63 mesa: always return GL_OUT_OF_MEMORY or GL_NO_ERROR when KHR_no_error enabled
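
The behaviour named in the title can be pictured as a filter on the recorded error code. This is a sketch only: `filter_error` is a hypothetical function, and the three constants are redefined locally with their values from the OpenGL headers so the block stands alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Values as defined in the OpenGL headers, repeated here so the sketch
 * compiles without them. */
#define GL_NO_ERROR      0
#define GL_INVALID_ENUM  0x0500
#define GL_OUT_OF_MEMORY 0x0505

static_assert(GL_NO_ERROR == 0, "zero must mean success");

/* Hypothetical filter: with no-error behaviour enabled, only
 * GL_OUT_OF_MEMORY is ever reported to the application. */
static unsigned
filter_error(unsigned error, bool no_error_enabled)
{
   if (no_error_enabled && error != GL_OUT_OF_MEMORY)
      return GL_NO_ERROR;
   return error;
}
```
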
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-19 16:53:25 +10:00
Timothy Arceri
3ff1fce6c9 mesa: add _mesa_is_no_error_enabled() helper
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-19 16:53:25 +10:00
Timothy Arceri
a0ed0eb342 mesa: add env var to force enable the KHR_no_error ctx flag
V2: typo know -> known
V3: add security check (Suggested by Nicolai)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-19 16:53:24 +10:00
Timothy Arceri
d42d150ad2 mesa: expose KHR_no_error
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-19 16:53:24 +10:00
Constantine Kharlamov
2a8a569276 r600g: update dirty_level_mask after the 1-st draw after FB change
Ported from radeonsi. Testing with Kane&Lynch2 shows ≈1k skipped updates per
frame on average.

No piglit changes with tests/gpu.py, gbm mode.

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-19 08:15:22 +02:00
Nicolai Hähnle
51deba0eb3 vbo: fix gl_DrawID handling in glMultiDrawArrays
Fixes a bug in
KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-19 08:11:07 +02:00
Nicolai Hähnle
42d5465b9b mesa: move glMultiDrawArrays to vbo and fix error handling
When any count[i] is negative, we must skip all draws.

Moving to vbo makes the subsequent change easier.

v2:
- provide the function in all contexts, including GLES
- adjust validation accordingly to include the xfb check
v3:
- fix mix-up of pre- and post-xfb prim count (Nils Wallménius)
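
The rule that one negative count[i] skips all draws means every count has to be checked before the first draw is issued. A minimal sketch, assuming a hypothetical `draw_fn` callback in place of the real draw path and a -1 return in place of the GL error:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback standing in for the real per-draw entry point. */
typedef void (*draw_fn)(int first, int count, void *data);

/* Returns the number of draws issued, or -1 if the whole call is rejected.
 * Validation is a separate pass so that a bad entry late in the list still
 * prevents the earlier draws. */
static int
multi_draw_arrays(const int *first, const int *count, int primcount,
                  draw_fn draw, void *data)
{
   assert(draw != NULL);
   if (primcount < 0)
      return -1;
   for (int i = 0; i < primcount; i++) {
      if (count[i] < 0)
         return -1; /* one bad entry rejects every draw */
   }

   int issued = 0;
   for (int i = 0; i < primcount; i++) {
      if (count[i] > 0) {
         draw(first[i], count[i], data);
         issued++;
      }
   }
   return issued;
}

/* Example callback: accumulates the number of vertices drawn. */
static void
count_vertices(int first, int count, void *data)
{
   (void)first;
   *(int *)data += count;
}
```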

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-19 08:10:19 +02:00
Nicolai Hähnle
756e9ebbdd mesa: extract need_xfb_remaining_prims_check
The same logic needs to be applied to glMultiDrawArrays.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-19 08:09:57 +02:00
Nicolai Hähnle
ea9a8940ca mesa: fix remaining xfb prims check for GLES with multiple instances
Found by inspection.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-19 08:09:53 +02:00
Mike Lothian
2284d6bf7a radv/meta: Fix nir_builder.h include
This fixes the build after:

commit 399ebd2a84
Author: Dave Airlie <airlied@redhat.com>
Date:   Wed Apr 19 06:18:23 2017 +1000

    radv/meta: add common shader vertex generation function

Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 12:25:18 +10:00
Mike Lothian
709ed1fa9f radv/ac: Fix nir.h include
This fixes the build after:

commit 224cf2906a
Author: Dave Airlie <airlied@redhat.com>
Date:   Mon Apr 17 13:01:52 2017 +1000

    radv/ac: add initial pre-pass for shader info gathering

Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 12:25:18 +10:00
Dave Airlie
03a2ca6356 radv/meta: refactor out some common shaders.
The vs vertex generate and fs noop shaders are used in a few places,
so refactor them out.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 10:03:05 +10:00
Dave Airlie
bdd98d950f radv/meta: generate position for blit shaders.
This generates the position info using the vertex shader.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 10:03:01 +10:00
Dave Airlie
922f44d1ab radv/meta: reduce vertex buffer in blit2d.
Generate the position vertices.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 10:02:58 +10:00
Dave Airlie
dd17e4ceb4 radv/meta: reduce vertex buffer usage in clear shaders
For depth clears we have to pass the depth in the 2nd
component; we can use push constants for some of this
later to drop the vertex buffer completely.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 10:02:53 +10:00
Dave Airlie
84b9e3a831 radv/meta: avoid using vertex buffer for resolve shader.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 10:02:50 +10:00
Dave Airlie
3a7fd0c4db radv/meta: move depth decompress to using inline vertex data
This removes the vertex buffer, and just generates the values
in the shader.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 10:02:47 +10:00
Dave Airlie
90ed2872bc radv/meta: move fast clear to generate vertices in shader.
Avoids having to setup vertex buffers.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 10:02:43 +10:00
Dave Airlie
399ebd2a84 radv/meta: add common shader vertex generation function
Instead of passing in the same 1.0, -1.0 combinations via
vertex buffers, we can just use vertex id to have the vertex
shader build them. This function introduces the generator
code needed, later patches will use this.
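
As an illustration of building the 1.0/-1.0 combinations from the vertex id, here is a CPU-side sketch in C. The mapping of ids to corners is an assumption chosen for the example, and `gen_rect_vertex` is a hypothetical name; the real generator emits shader code rather than running on the host.

```c
#include <assert.h>

/* Hypothetical mapping for three corners of a rectangle:
 *   vertex 0 -> (-1.0, -1.0)
 *   vertex 1 -> (-1.0,  1.0)
 *   vertex 2 -> ( 1.0, -1.0)
 * so x is 1.0 only for vertex 2 and y is 1.0 only for vertex 1. */
static void
gen_rect_vertex(unsigned vertex_id, float pos[2])
{
   assert(vertex_id < 3);
   pos[0] = (vertex_id == 2) ? 1.0f : -1.0f;
   pos[1] = (vertex_id == 1) ? 1.0f : -1.0f;
}
```

Each component is a single compare-and-select on the id, which is why no vertex buffer is needed to supply the positions.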

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 10:02:39 +10:00
Dave Airlie
0e6d532d32 radv/meta: add support for save/restore meta without vertex data.
Some of the shaders could just generate the vertex data in the
shader, so add helpers to allow us to move to doing that.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 10:02:23 +10:00
Dave Airlie
60a93e11ba radv: drop debugging leftovers code in descriptor set patches.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 09:31:14 +10:00
Dave Airlie
fd420a7417 radv: add support for 32 descriptor sets.
This bumps the limit on the number of sets to 32, now that
we have proper support for it. It also uses 1u in a few places
to make things a bit safer.
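
The reason 1u matters once 32 sets are allowed: with a 32-bit int, `1 << 31` shifts into the sign bit, which is undefined behaviour in C, while `1u << 31` is well defined. A small sketch with hypothetical helper names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SETS 32u /* matches the new limit described above */

/* Hypothetical helpers for a 32-bit "sets in use" mask. The unsigned
 * literal keeps the shift defined for every index up to 31. */
static uint32_t
mark_set_used(uint32_t mask, unsigned idx)
{
   assert(idx < MAX_SETS);
   return mask | (1u << idx);
}

static bool
set_is_used(uint32_t mask, unsigned idx)
{
   assert(idx < MAX_SETS);
   return (mask & (1u << idx)) != 0;
}
```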

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 09:00:43 +10:00
Dave Airlie
25a5ee391d radv/ac: add support for indirect access of descriptor sets.
We want to expose more descriptor sets to the applications,
but currently we have a 1:1 mapping between shader descriptor
sets and 2 user sgprs, limiting us to 4 per stage. This commit
checks whether we have enough user sgprs for the number of
bound sets for this shader; if not, we can ask for them to be indirected.

Two sgprs are then used to point to a buffer of 64-bit pointers,
one per allocated descriptor set. All shaders point
to the same buffer.

We can use some user sgprs to inline one or two descriptor sets
in future, but until we have a workload that needs this I don't
 think we should spend too much time on it.
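
The decision described above can be sketched as a small planning function: each directly bound set costs two user sgprs, and when that does not fit, two sgprs hold a single pointer to a table of set pointers instead. `struct set_plan` and `plan_descriptor_sets` are hypothetical names, and the budget passed in is an assumption of the example, not a hardware figure.

```c
#include <assert.h>
#include <stdbool.h>

#define SGPRS_PER_POINTER 2u /* a 64-bit address occupies two 32-bit sgprs */

/* Hypothetical result of the planning step. */
struct set_plan {
   bool indirect;       /* true: sets are reached through one table pointer */
   unsigned sgprs_used; /* user sgprs consumed for descriptor sets */
};

static struct set_plan
plan_descriptor_sets(unsigned num_sets, unsigned sgprs_available)
{
   struct set_plan plan = { false, 0 };

   assert(sgprs_available >= SGPRS_PER_POINTER);
   if (num_sets * SGPRS_PER_POINTER <= sgprs_available) {
      /* Direct: one pointer per set lives in user sgprs. */
      plan.sgprs_used = num_sets * SGPRS_PER_POINTER;
   } else {
      /* Indirect: a single pointer to the table of set pointers. */
      plan.indirect = true;
      plan.sgprs_used = SGPRS_PER_POINTER;
   }
   return plan;
}
```

With a budget of 8 sgprs, 4 sets still fit directly, while a fifth set switches the shader to the indirect layout at the cost of one extra load per set access.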

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 09:00:43 +10:00
Dave Airlie
d0991b135b radv: start allocating user sgprs
This adds an initial implementation to allocate the user
sgprs and make sure we don't run out if we try to bind
a bunch of descriptor sets.

This can be enhanced further in the future if we add
support for inlining push constants.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 09:00:43 +10:00
Dave Airlie
4087eaecd0 radv/ac: mark used descriptor sets in shader info.
This pre calculates the used descriptor sets.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 09:00:43 +10:00
Dave Airlie
0b62669c8d radv/ac: frag shader only needs ring offsets if sample positions enabled
This mostly documents things, since with modern llvm we always have the
spill enabled.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 09:00:42 +10:00
Dave Airlie
ec4785afb7 radv/ac: move needs_push_constants to shader info.
First step to optimising push constants.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 09:00:42 +10:00
Dave Airlie
ec15e0d301 radv: optimise compute shader grid size emission.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 09:00:42 +10:00
Dave Airlie
31174069d2 radv: start conditionalising vertex inputs. (v2)
In practice this will probably just drop draw id in a few places.

v2: just do draw_id for now. (Bas)
it might be possible to do something more if we need it in the
future. (nha)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 09:00:42 +10:00
Dave Airlie
224cf2906a radv/ac: add initial pre-pass for shader info gathering
There is some radv specific info we need to gather from shaders
before we get into converting nir->llvm, so we can make
better decisions especially around user sgpr allocation.

This is just an initial placeholder to gather if sample positions
are required in the frag shader.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 09:00:42 +10:00
Rob Clark
4299849ec7 freedreno: refactor dirty state handling
In particular, move per-shader-stage info out to a separate array of
enums indexed by shader stage.  This will make it easier to add more
shader stages as well as new per-stage state (like SSBOs).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-18 16:32:00 -04:00
Rob Clark
d7fa7f5e7e freedreno: move clear path dirty state hack to a2xx backend
a3xx/a4xx use the generic u_blitter path, which will make state dirty
bits be set appropriately thanks to the automagic of generic code
setting generic state in the driver.  And a5xx has a blit/dma engine
(actually, two) so it doesn't need these extra dirty bits set.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-18 16:32:00 -04:00
Rob Clark
b662f71d9c freedreno/ir3: split out per-stage emit_consts fxns
This makes it easier to deal with adding additional stages which have
their own driver-params.  The duplicated code this introduces can be
refactored out after a later patch moves to per-shader-stage dirty
flags.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-18 16:32:00 -04:00
Rob Clark
df37902e34 freedreno: add helper to mark all state clean
Note that this involves juggling around a bit when we emit and clear
texture state.  So split out from the patch that adds the helper to set
all state dirty.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-18 16:32:00 -04:00
Rob Clark
71f9e03d21 freedreno: add helper to mark all state dirty
This will simplify things when we break out per-shader-stage dirty bits.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-18 16:32:00 -04:00
Rob Clark
248a508f24 freedreno: move a2xx specific hack out of core
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-18 16:32:00 -04:00
Rob Clark
0cc23ae779 freedreno: make texture state an array
Make this an array indexed by shader stage, as is done elsewhere for
other per-shader-stage state.  This will simplify things as more shader
stages are eventually added.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-18 16:32:00 -04:00
Rob Clark
5845b20455 freedreno/ir3: refactor out helpers for comparing shader keys
Each of the ir3 users has *basically* the same logic for comparing the
previous and current shader key, to see which, if any, shader state
needs to be marked dirty due to shader variant change.

The difference between gen's was just that some lowering flags never get
set on certain generations.  But it doesn't really hurt to include the
extra checks (because both keys would have false).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-18 16:32:00 -04:00
Rob Clark
6fb7935ded util/queue: don't hang at exit
So atexit() is horrible and 4aea8fe7 is probably not a good idea.  But
add an extra layer of duct-tape to the problem.  Otherwise we hit a
situation where an app using an atexit() handler that runs later than ours
hangs when trying to tear down a context.

 (gdb) bt
 #0  util_queue_killall_and_wait (queue=queue@entry=0x52bc80) at ../../../src/util/u_queue.c:264
 #1  0x0000007fb6c380c0 in atexit_handler () at ../../../src/util/u_queue.c:51
 #2  0x0000007fb7730e2c in __run_exit_handlers () from /lib64/libc.so.6
 #3  0x0000007fb7730e5c in exit () from /lib64/libc.so.6
 #4  0x0000007fb7ce17dc in piglit_report_result (result=PIGLIT_PASS) at /home/robclark/src/piglit/tests/util/piglit-util.c:267
 #5  0x0000007fb7ef99f8 in process_next_event (x11_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:139
 #6  0x0000007fb7ef9a90 in enter_event_loop (winsys_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:153
 #7  0x0000007fb7ef8e50 in run_test (gl_fw=0x432c20, argc=1, argv=0x7ffffff588) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_winsys_framework.c:88
 #8  0x0000007fb7edb890 in piglit_gl_test_run (argc=1, argv=0x7ffffff588, config=0x7ffffff400) at /home/robclark/src/piglit/tests/util/piglit-framework-gl.c:203
 #9  0x0000000000401224 in main (argc=1, argv=0x7ffffff588) at /home/robclark/src/piglit/tests/bugs/drawbuffer-modes.c:46
 (gdb) c
 Continuing.
 [Thread 0x7fb67580c0 (LWP 3471) exited]
 ^C
 Thread 1 "drawbuffer-mode" received signal SIGINT, Interrupt.
 0x0000007fb72dda34 in pthread_cond_wait@@GLIBC_2.17 () from /lib64/libpthread.so.0
 (gdb) bt
 #0  0x0000007fb72dda34 in pthread_cond_wait@@GLIBC_2.17 () from /lib64/libpthread.so.0
 #1  0x0000007fb6c38304 in cnd_wait (mtx=0x5bdc90, cond=0x5bdcc0) at ../../../include/c11/threads_posix.h:159
 #2  util_queue_fence_wait (fence=0x5bdc90) at ../../../src/util/u_queue.c:106
 #3  0x0000007fb6daac70 in fd_batch_sync (batch=0x5bdc70) at ../../../../../src/gallium/drivers/freedreno/freedreno_batch.c:233
 #4  batch_reset (batch=batch@entry=0x5bdc70) at ../../../../../src/gallium/drivers/freedreno/freedreno_batch.c:183
 #5  0x0000007fb6daa5e0 in batch_flush (batch=0x5bdc70) at ../../../../../src/gallium/drivers/freedreno/freedreno_batch.c:290
 #6  fd_batch_flush (batch=0x5bdc70, sync=<optimized out>) at ../../../../../src/gallium/drivers/freedreno/freedreno_batch.c:308
 #7  0x0000007fb6daba2c in fd_bc_flush (cache=0x461220, ctx=0x52b920) at ../../../../../src/gallium/drivers/freedreno/freedreno_batch_cache.c:141
 #8  0x0000007fb6dac954 in fd_context_flush (pctx=0x52b920, fence=0x0, flags=<optimized out>) at ../../../../../src/gallium/drivers/freedreno/freedreno_context.c:54
 #9  0x0000007fb6b43294 in st_glFlush (ctx=<optimized out>) at ../../../src/mesa/state_tracker/st_cb_flush.c:121
 #10 0x0000007fb69a84e8 in _mesa_make_current (newCtx=newCtx@entry=0x0, drawBuffer=drawBuffer@entry=0x0, readBuffer=readBuffer@entry=0x0) at ../../../src/mesa/main/context.c:1654
 #11 0x0000007fb6b7ca58 in st_api_make_current (stapi=<optimized out>, stctxi=0x0, stdrawi=0x0, streadi=0x0) at ../../../src/mesa/state_tracker/st_manager.c:827
 #12 0x0000007fb6cc87e8 in dri_unbind_context (cPriv=<optimized out>) at ../../../../../src/gallium/state_trackers/dri/dri_context.c:217
 #13 0x0000007fb6cc80b0 in driUnbindContext (pcp=0x5271e0) at ../../../../../../src/mesa/drivers/dri/common/dri_util.c:591
 #14 0x0000007fb7d1da08 in MakeContextCurrent (dpy=0x433380, draw=0, read=0, gc_user=0x0) at ../../../src/glx/glxcurrent.c:214
 #15 0x0000007fb7a8d5e0 in glx_platform_make_current () from /lib64/libwaffle-1.so.0
 #16 0x0000007fb7a894e4 in waffle_make_current () from /lib64/libwaffle-1.so.0
 #17 0x0000007fb7ef8c60 in piglit_wfl_framework_teardown (wfl_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_wfl_framework.c:628
 #18 0x0000007fb7ef939c in piglit_winsys_framework_teardown (winsys_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_winsys_framework.c:238
 #19 0x0000007fb7ef9c30 in destroy (gl_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:212
 #20 0x0000007fb7edb7c4 in destroy () at /home/robclark/src/piglit/tests/util/piglit-framework-gl.c:184
 #21 0x0000007fb7730e2c in __run_exit_handlers () from /lib64/libc.so.6
 #22 0x0000007fb7730e5c in exit () from /lib64/libc.so.6
 #23 0x0000007fb7ce17dc in piglit_report_result (result=PIGLIT_PASS) at /home/robclark/src/piglit/tests/util/piglit-util.c:267
 #24 0x0000007fb7ef99f8 in process_next_event (x11_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:139
 #25 0x0000007fb7ef9a90 in enter_event_loop (winsys_fw=0x432c20) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_x11_framework.c:153
 #26 0x0000007fb7ef8e50 in run_test (gl_fw=0x432c20, argc=1, argv=0x7ffffff588) at /home/robclark/src/piglit/tests/util/piglit-framework-gl/piglit_winsys_framework.c:88
 #27 0x0000007fb7edb890 in piglit_gl_test_run (argc=1, argv=0x7ffffff588, config=0x7ffffff400) at /home/robclark/src/piglit/tests/util/piglit-framework-gl.c:203
 #28 0x0000000000401224 in main (argc=1, argv=0x7ffffff588) at /home/robclark/src/piglit/tests/bugs/drawbuffer-modes.c:46
 (gdb) r

Fixes: 4aea8fe7 ("gallium/u_queue: fix random crashes when the app calls exit()")
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-18 16:32:00 -04:00
Eric Anholt
c1362e78ad vc4: Enable V3D 2.6.
This version of the chip is present on the Cygnus-based 911360 enterprise
phone platform.  It appears to be completely backwards compatible.
2017-04-18 13:21:40 -07:00
Samuel Pitoiset
a18ff34452 st/mesa: add st_convert_sampler()
Similar to st_convert_image(), will be useful for bindless. While
we are at it, rename convert_sampler() to convert_sampler_from_unit()
and make 'st' a const argument.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-18 21:42:01 +02:00
Bartosz Tomczyk
ca41ecf838 mesa/glthread: add async support to ARB_viewport_array functions
v2: fix attribute name, it is count_scale not scale_count

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-18 12:19:12 +02:00
Timothy Arceri
a63919f848 mesa: rename _mesa_add_renderbuffer* functions
These names make it easier to understand what is going on in
regards to references.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-18 10:01:55 +10:00
Nanley Chery
d9d793696b anv/cmd_buffer: Disable CCS on BDW input attachments
The description under RENDER_SURFACE_STATE::RedClearColor says,

   For Sampling Engine Multisampled Surfaces and Render Targets:
    Specifies the clear value for the red channel.
   For Other Surfaces:
    This field is ignored.

This means that the sampler on BDW doesn't support CCS.

Cc: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2017-04-17 16:47:38 -07:00
Lionel Landwerlin
d71efbe5f2 anv: blorp: flush memory after copy
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-04-17 14:45:57 -07:00
Grazvydas Ignotas
ba6c451390 radv: enable timestampComputeAndGraphics
Commit bfee9866 "radv: Use RELEASE_MEM packet for MEC timestamp query."
added WriteTimestamp handling for compute queues but forgot to flip
the flag.

Tested with DOOM (by me) and CTS (by Bas), but without verification
that these tests actually use timestamps on compute queues.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-04-17 21:21:35 +03:00
Rob Clark
d4601b0efc freedreno: fix crash if ctx torn down with no rendering
In this case, ctx->flush_queue would not have been initialized.

Fixes: 0b613c20 ("freedreno: enable draw/batch reordering by default")
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-17 14:00:05 -04:00
Rob Clark
15fe9b2347 freedreno/ir3: add 'high' register class
For compute shaders, we need to be able to allocate some "high"
registers (r48.x to r55.w).  (Possibly these are global to all threads
in a warp?)  Add a new register class to handle this.
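
For a sense of the size of that range: r48.x to r55.w is 8 registers of 4 components, i.e. 32 scalar slots. The sketch below assumes a flat scalar numbering of `reg * 4 + component`, which is an assumption of this example rather than something stated above, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define HIGH_REG_FIRST 48u /* r48 */
#define HIGH_REG_LAST  55u /* r55 */
#define COMPONENTS     4u  /* x, y, z, w */

/* Assumed flat numbering: rN.c maps to N * 4 + c. */
static unsigned
scalar_index(unsigned reg, unsigned comp)
{
   assert(comp < COMPONENTS);
   return reg * COMPONENTS + comp;
}

/* True for scalar slots inside r48.x .. r55.w under that numbering. */
static bool
is_high_scalar(unsigned idx)
{
   return idx >= scalar_index(HIGH_REG_FIRST, 0) &&
          idx <= scalar_index(HIGH_REG_LAST, COMPONENTS - 1);
}
```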

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-17 14:00:05 -04:00
Rob Clark
3c5d309477 freedreno: extract helper for stage->sb for a4xx+
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-17 14:00:05 -04:00
Rob Clark
9567beab36 freedreno/{a4xx,a5xx}: switch to CP_LOAD_STATE4
The layout of CP_LOAD_STATE packet is slightly different on a4xx+.
Switch to the a4xx+ specific CP_LOAD_STATE4 to get the correct encoding.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-17 14:00:05 -04:00
Rob Clark
dfdb1fed78 freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-17 14:00:05 -04:00
Emil Velikov
9915753e63 configure.ac: print deprecation warning as needed
The warning should be printed only when one explicitly uses the
deprecated configure toggle.

Fixes: 7748c3f5eb ("configure.ac: deprecate --with-egl-platforms over
--with-platforms")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-04-17 15:07:44 +01:00
Emil Velikov
19aec22c75 docs: add news item and link release notes for 17.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-17 14:44:35 +01:00
Emil Velikov
89ef8750f0 docs: add sha256 checksums for 17.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 12434966eb)
2017-04-17 14:43:27 +01:00
Emil Velikov
d271401d61 docs: add release notes for 17.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 367bafc7c1)
2017-04-17 14:43:26 +01:00
Emil Velikov
36aea77cd7 docs: add 17.2.0-devel release notes template, bump version
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-17 14:31:41 +01:00
Emil Velikov
7748c3f5eb configure.ac: deprecate --with-egl-platforms over --with-platforms
Currently the former controls more than just EGL. With follow-up commits
we'll unwind and fix things so that one can build the different drivers
with said platform support.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-17 13:37:41 +01:00
Emil Velikov
de128c19ee configure: remove egl platforms check
The configure option is used by more than just EGL, and with the next commit
we'll rename it accordingly. Thus having the check will be (and already is)
incorrect.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-17 13:13:09 +01:00
Emil Velikov
618a7b984b travis: remove unneeded dri3/present proto requirement
Signed-off-by: Emil Velikov <emil.lvelikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-04-17 13:12:03 +01:00
Emil Velikov
291a9405a5 configure: remove unneeded dri3/present proto requirements
We are not using either of these. The respective xcb packages are used
instead.

v2: Rebase, reword commit message.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-04-17 13:10:37 +01:00
Kyle Brenneman
ce562f9e3f EGL: Implement the libglvnd interface for EGL (v3)
The new interface mostly just sits on top of the existing library.

The only change to the existing EGL code is to split the client
extension string into platform extensions and everything else. On
non-glvnd builds, eglQueryString will just concatenate the two strings.

The EGL dispatch stubs are all generated. The script is based on the one
used to generate entrypoints in libglvnd itself.

v2: [Kyle]
 - Rebased against master.
 - Reworked the EGL makefile to use separate libraries
 - Made the EGL code generation scripts work with Python 2 and 3.
 - Change gen_egl_dispatch.py to use argparse for the command line arguments.
 - Assorted formatting and style cleanup in the Python scripts.

v3: [Emil Velikov]
 - Rebase
 - Remove separate glvnd glx/egl configure toggles

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-17 13:03:58 +01:00
Tapani Pälli
370df207ca android: add marshal_generated c and h files to generated sources
Fixes: efd63e2 ("mesa: Connect the generated GL command marshalling code to the build.")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-17 12:20:06 +01:00
Emil Velikov
3bcef6aa24 configure.ac: honour --disable-libunwind if the .pc file is present
We should check the presence in order to determine if we should
[implicitly] set the CFLAGS/LIBS

v2: Drop spurious OMX hunk (Eric)

Cc: Eric Anholt <eric@anholt.net>
Reported-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-17 12:05:10 +01:00
Emil Velikov
39c3482205 docs: document the C++14 SWR requirement
Earlier commit bumped the requirement for the SWR driver.

v2: Fold the note with the LLVM 3.9 one (Tim)

Fixes: 3c52a7316a ("swr: [configure.ac/scons] require c++14")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-04-17 12:04:22 +01:00
Samuel Pitoiset
84ed2e1192 winsys/amdgpu: init buffer_indices_hashlist with memset()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-17 11:59:17 +02:00
Samuel Pitoiset
af612816bc winsys/amdgpu: simplify amdgpu_cs_add_buffer() a bit
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-17 11:59:17 +02:00
Kenneth Graunke
7c3b8ed878 i965/drm: Delete NULL check in brw_bo_unmap().
I accidentally moved the bo->bufmgr dereference above the NULL check
when cleaning up this code.

While passing NULL to free() is a common pattern...passing NULL to
unmap seems pretty bad.  You really ought to know whether you have
a buffer or not.  We don't want to paper over bugs like that.  So,
just drop the NULL check altogether.

CID: 1405006

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-16 22:58:23 -07:00
Kenneth Graunke
9b71709cb8 intel/decoder: Fix is_header_field starting condition.
Starting positions >= 32 are not part of the header, rather than >.

Caught by Coverity, which found that "bits <<= field->start" may shift
by 32, which has undefined behavior.

CID: 1404968

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-04-16 22:58:23 -07:00
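A minimal sketch of the boundary condition behind the is_header_field fix above, assuming fields are described by inclusive bit positions within the first dword. The struct and function names are hypothetical and are not the decoder's real API; the point is only that a start position of 32 must be rejected before any 32-bit shift happens.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a decoder field: inclusive bit positions,
 * counted from bit 0 of the first dword. */
struct field_sketch {
    unsigned start;
    unsigned end;
};

/* Returns true when the field overlaps the bits set in header_mask.
 * Fields starting at bit 32 or later are outside the first dword, so they
 * are rejected up front ("start >= 32", not "start > 32"). */
static bool is_header_field_sketch(const struct field_sketch *f,
                                   uint32_t header_mask)
{
    assert(f != NULL);

    if (f->start >= 32 || f->end < f->start)
        return false;

    unsigned last = f->end < 31 ? f->end : 31;  /* clamp to the dword */
    unsigned width = last - f->start + 1;       /* 1..32 */

    /* Build the mask in 64 bits so that width == 32 stays well defined. */
    uint64_t bits = ((UINT64_C(1) << width) - 1) << f->start;
    return ((uint32_t)bits & header_mask) != 0;
}
```

With the old "> 32" style of check, a field starting exactly at bit 32 would reach the shift, which is the case Coverity flagged.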
Kenneth Graunke
6142c3e298 i965/drm: Remove dead return in brw_bo_busy()
If ret is 0, we return.  If ret is not 0, we return.  This is dead.

CID: 1405013 (Structurally dead code (UNREACHABLE))

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-04-16 22:58:22 -07:00
Mauro Rossi
8c79dbe94e android: amd/addrlib: trivial fix for gfx9 support
Fixes the following build error:

external/mesa/src/amd/addrlib/gfx9/gfx9addrlib.cpp:36:10: fatal error: 'gfx9_gb_reg.h' file not found
         ^
1 error generated.

Fixes: 7f160ef "amd/addrlib: import gfx9 support"
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-17 14:04:21 +10:00
Jason Ekstrand
4cf079f7f2 nir: Add GLSL_TYPE_[U]INT64 to some switch statements
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-04-16 20:14:42 -07:00
Marek Olšák
2769dadb0f gallium/radeon: always flush asynchronously and wait after begin_new_cs
This hides the overhead of everything in the driver after the CS flush and
before returning from pipe_context::flush.
Only microbenchmarks will benefit.

+2% FPS for glxgears.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-17 01:22:11 +02:00
Marek Olšák
f05f0bb5cb radeonsi: remove local variable 'mod' from si_compile_tgsi_shader
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-17 01:22:11 +02:00
Marek Olšák
bd2cde0c25 radeonsi: add si_shader_selector::vs_needs_prolog
cleanup

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-17 01:22:11 +02:00
Marek Olšák
777f305840 radeonsi: don't set VGT_GS_MODE as part of the GS state
The VS state sets it.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-17 01:22:11 +02:00
Marek Olšák
5438e39fae radeonsi: don't allow user indices with indirect draws
Not possible with GL and it will make future gallium rework easier.
(also it's something I wouldn't like to support)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-17 01:22:11 +02:00
Marek Olšák
1c94d29984 radeonsi: merge two if (indirect) statements
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-17 01:22:11 +02:00
Marek Olšák
bdd6449769 radeonsi: don't mark non-dirty textures with CMASK as compressed
because the compression is skipped with non-dirty textures.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-17 01:22:11 +02:00
Bas Nieuwenhuizen
566f2ed571 docs: Document interaction Fixes tag and stable branches.
For the next time I forget.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-15 11:37:46 +02:00
Timothy Arceri
9f0dd85aa6 glsl: don't run the GLSL pre-processor when we are skipping compilation
This moves the hashing of shader source for the cache lookup to before
the preprocessor.  In our experience, shaders are unlikely to hash the
same after preprocessing if they didn't hash the same before, so we can
skip preprocessing for cache hits.

Improves Deus Ex start-up times with a warm cache from ~30 seconds to
~22 seconds.

Also fixes the leaking of state.

V2: fix indentation

v3: add the value of MESA_EXTENSION_OVERRIDE to the hash of the shader.

Tested-by (v2): Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-04-15 11:36:52 +10:00
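A rough sketch of the ordering described in the commit above: hash the raw source (plus the MESA_EXTENSION_OVERRIDE value) first, and only preprocess on a cache miss. Everything here is illustrative; the FNV-1a hash, the tiny direct-mapped table and all names are assumptions made for the example, not the data structures Mesa uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS_SKETCH 8  /* deliberately tiny, illustration only */

struct key_cache_sketch {
    uint64_t keys[CACHE_SLOTS_SKETCH];
    bool used[CACHE_SLOTS_SKETCH];
};

/* 64-bit FNV-1a over a NUL-terminated string, continuing from h. */
static uint64_t fnv1a_str(uint64_t h, const char *s)
{
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= UINT64_C(0x100000001b3);
    }
    return h;
}

/* The key covers the unpreprocessed source and the extension override. */
static uint64_t shader_key_sketch(const char *source, const char *ext_override)
{
    uint64_t h = UINT64_C(0xcbf29ce484222325);
    h = fnv1a_str(h, source);
    h = fnv1a_str(h, "\x01");  /* separator between the two inputs */
    h = fnv1a_str(h, ext_override ? ext_override : "");
    return h;
}

/* Returns true when the shader was found in the cache and all further work
 * was skipped. *preprocess_runs counts how often the (stand-in)
 * preprocessing step actually ran. */
static bool compile_shader_sketch(struct key_cache_sketch *c,
                                  const char *source,
                                  const char *ext_override,
                                  unsigned *preprocess_runs)
{
    assert(c != NULL && source != NULL && preprocess_runs != NULL);

    uint64_t key = shader_key_sketch(source, ext_override);
    size_t slot = (size_t)(key % CACHE_SLOTS_SKETCH);

    if (c->used[slot] && c->keys[slot] == key)
        return true;             /* cache hit: no preprocessing at all */

    (*preprocess_runs)++;        /* stands in for preprocess + compile */
    c->used[slot] = true;
    c->keys[slot] = key;
    return false;
}
```

Folding the override string into the key mirrors the v3 note: the same source with different extension overrides must not share a cache entry.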
Timothy Arceri
c2bc0aa7b1 glsl: delay optimisations on individual shaders when cache is available
Due to a max limit of 65,536 entries on the index table that we use to
decide if we can skip compiling individual shaders, it is very likely
we will have collisions.

To avoid doing too much work when the linked program may be in the
cache this patch delays calling the optimisations until link time.

Improves cold cache start-up times on Deus Ex by ~20 seconds.

When deleting the cache index to simulate a worst case scenario
of collisions in the index, warm cache start-up time improves by
~45 seconds.

V2: fix indentation, make sure to call optimisations on cache
fallback, make sure optimisations get called for XFB.

Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-15 11:36:44 +10:00
Jason Ekstrand
d2d6cf6c83 anv: Add the pci_id into the shader cache UUID
This prevents a user from using a cache created on one hardware
generation on a different one.  Of course, with Intel hardware, this
requires moving their drive from one machine to another but it's still
possible and we should prevent it.

Reviewed-by: Chad Versace <chadversary@chromium.org>
Cc: mesa-stable@lists.freedesktop.org
2017-04-14 17:41:07 -07:00
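To illustrate the commit above, here is a hedged sketch of mixing a device's PCI ID into a cache UUID so that caches from different hardware never match. The hash (FNV-1a), the UUID layout and the function names are assumptions chosen for brevity; they are not what anv actually computes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UUID_SIZE_SKETCH 16

/* 64-bit FNV-1a over a byte range, starting from seed. Used only to keep
 * the sketch short; it is not the hash the driver uses. */
static uint64_t fnv1a_bytes(uint64_t seed, const void *data, size_t len)
{
    const uint8_t *p = data;
    uint64_t h = seed;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= UINT64_C(0x100000001b3);
    }
    return h;
}

/* Derive a 16-byte UUID from a build identifier and the PCI device ID. */
static void cache_uuid_sketch(uint8_t uuid[UUID_SIZE_SKETCH],
                              const void *build_id, size_t build_id_len,
                              uint16_t pci_id)
{
    assert(uuid != NULL && build_id != NULL);

    const uint8_t id_bytes[2] = {
        (uint8_t)(pci_id & 0xff), (uint8_t)(pci_id >> 8)
    };
    const uint64_t seeds[2] = {
        UINT64_C(0xcbf29ce484222325), UINT64_C(0x84222325cbf29ce4)
    };

    /* Two differently seeded hashes fill the two 8-byte halves. */
    for (int half = 0; half < 2; half++) {
        uint64_t h = fnv1a_bytes(seeds[half], build_id, build_id_len);
        h = fnv1a_bytes(h, id_bytes, sizeof(id_bytes));
        memcpy(uuid + 8 * half, &h, sizeof(h));
    }
}
```

The same build on two different devices then yields two different UUIDs, so a cache written on one is rejected on the other.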
Philipp Zabel
36f2101723 etnaviv: native fence fd support
This adds native fence fd support to etnaviv, similarly to commit
0b98e84e9b ("freedreno: native fence fd"), enabled for kernel
driver version 1.1 or later.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-04-15 01:47:18 +02:00
Francisco Jerez
96dfc014fd docs: mark GL_ARB_vertex_attrib_64bit and OpenGL 4.2 as supported by i965/gen7+
v2 (Andreas Boll):
- Mark GL 4.1 as supported by i965/gen7+
- Mark GL_ARB_shader_precision as supported by i965/gen7+
- Update release notes

Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 16:13:21 -07:00
Juan A. Suarez Romero
1877982aca i965: enable OpenGL 4.2 in Ivybridge
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 16:13:21 -07:00
Samuel Iglesias Gonsálvez
92d4dc76ea i965: enable ARB_shader_precision in gen7+
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 16:13:21 -07:00
Juan A. Suarez Romero
0aed1212ae i965: enable ARB_vertex_attrib_64bit for gen7+
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 16:13:21 -07:00
George Kyriazis
b9d4256e11 swr: Fix swr osmesa build
Use GALLIUM_SWR to standardize

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-14 18:03:40 -05:00
Wladimir J. van der Laan
6a8d5ab932 etnaviv: SINGLE_BUFFER support on GC3000
This patch adds support for the SINGLE_BUFFER feature on GC3000
GPUs, which allows rendering to a single buffer using multiple pixel
pipes.

This feature is always used when it is available, which means that
multi-tiled formats are no longer being used in that case, and all
buffers will be normal (super)tiled. This mimics the behavior of the
blob on GC3000.

- Because the same format can be used to render to and texture from,
  this avoids an extra resolve pass when rendering to texture.

- i.MX6qp includes a PRE which can scan-out directly from tiled formats,
  avoiding untiling overhead.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-04-15 00:34:13 +02:00
Wladimir J. van der Laan
1dcb1d4925 etnaviv: Update includes from rnndb
Update to etna_viv commit 8486a97.

austriancoder: changed patch to include isa redefinition fix.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-04-15 00:34:08 +02:00
Wladimir J. van der Laan
9e4d049f40 etnaviv: Add chipMinorFeatures4 and 5
Request chipMinorFeatures bitfields 4 and 5 from the
drm driver.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-04-15 00:34:03 +02:00
Philipp Zabel
dda956340c etnaviv: resolve tile status when flushing resource
When passing render buffers from EGL clients to a wayland compositor,
the resource tile status must be resolved because otherwise the tile
status is lost in the transfer and cleared parts of the buffer will
contain old contents.

The same applies when sampling directly from a renderable resource.

lst: Add seqno tracking, to skip flush when not needed.

Fixes: aadcb5e94b35 ("etnaviv: enable TS, but disable autodisable")
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-04-15 00:15:30 +02:00
Philipp Zabel
f30aab7696 etnaviv: stop repeatedly resolving an unchanged resource into its scanout prime buffer
Before resolving a resource into its scanout prime buffer, check that
the prime resource is actually older. If it is not, the resolve is an
expensive no-op, and we better skip it.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-04-15 00:15:27 +02:00
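The "is the prime resource actually older" check from the commit above can be sketched with sequence numbers. This is a minimal illustration under the assumption that each resource carries a 32-bit seqno bumped on every write; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical resource: seqno is bumped whenever the contents change. */
struct resource_sketch {
    uint32_t seqno;
};

/* True when prime is older than src. Unsigned subtraction keeps the
 * comparison correct across 32-bit wraparound. */
static bool prime_is_older(const struct resource_sketch *src,
                           const struct resource_sketch *prime)
{
    uint32_t delta = src->seqno - prime->seqno;
    return delta != 0 && delta < UINT32_C(0x80000000);
}

/* Returns true when a resolve was performed, false when it was skipped
 * because the scanout copy was already up to date. */
static bool flush_resource_sketch(const struct resource_sketch *src,
                                  struct resource_sketch *prime)
{
    assert(src != NULL && prime != NULL);

    if (!prime_is_older(src, prime))
        return false;            /* the resolve would be a no-op */

    /* ... the expensive resolve into the prime buffer would go here ... */
    prime->seqno = src->seqno;
    return true;
}
```

Calling the flush twice without touching the source performs the resolve once and skips it the second time.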
George Kyriazis
d7a1f01db3 swr: Add polygon stipple support
Add polygon stipple functionality to the fragment shader.

Explicitly turn off polygon stipple for lines and points, since we
do them using tris.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-14 17:08:12 -05:00
Samuel Iglesias Gonsálvez
8973ae3162 docs/relnotes: add GL_ARB_gpu_shader_fp64 support on i965/ivybridge
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Acked-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:10 -07:00
Samuel Iglesias Gonsálvez
ef49dda2df docs: mark GL_ARB_gpu_shader_fp64 and OpenGL 4.0 as supported by i965/gen7+
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Acked-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:10 -07:00
Samuel Iglesias Gonsálvez
a494afdb8e i965: enable OpenGL 4.0 to Ivybridge/Baytrail
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:10 -07:00
Samuel Iglesias Gonsálvez
cd0a6b2fc2 i965: enable ARB_gpu_shader_fp64 for Ivybridge/Baytrail
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:09 -07:00
Matt Turner
2eeb1b0ad9 i965: Use correct VertStride on align16 instructions.
In commit c35fa7a, we changed the "width" of DF source registers to 2,
which is conceptually fine. Unfortunately a VertStride of 2 is not
allowed by align16 instructions on IVB/BYT, and the regular VertStride
of 4 works fine in any case.

See generated_tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/vs-round-double.shader_test
for example:

cmp.ge.f0(8)    g18<1>DF        g1<0>.xyxyDF    -g8<2>DF        { align16 1Q };
        ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed
cmp.ge.f0(8)    g19<1>DF        g1<0>.xyxyDF    -g9<2>DF        { align16 2N };
        ERROR: In Align16 mode, only VertStride of 0 or 4 is allowed

v2:
- Add spec quote (Curro).
- Change the condition to only BRW_VERTICAL_STRIDE_2 (Curro)

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:09 -07:00
Samuel Iglesias Gonsálvez
d8441e2276 i965/vec4/dce: improve track of partial flag register writes
This is required for correctness in presence of multiple 4-wide flag
writes (e.g. 4-wide instructions with a conditional mod set) which
update a different portion of the same 8-bit flag subregister.

Right now we keep track of flag dataflow with 8-bit granularity and
consider flag writes to have killed any previous definition of the
same subregister even if the write was less than 8 channels wide,
which can cause live flag register updates to be dead
code-eliminated incorrectly.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:09 -07:00
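A simplified model of the problem described in the commit above, assuming one liveness bit per group of four flag channels. The instruction struct and the backward walk are hypothetical and far smaller than the real vec4 dead-code pass; they only show why a 4-wide write must kill its own half of the subregister and nothing more.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical instruction: which channels it covers and how it uses
 * the flag subregister. */
struct inst_sketch {
    unsigned group;      /* first channel, multiple of 4 */
    unsigned exec_size;  /* 4 or 8 */
    bool writes_flag;
    bool reads_flag;
};

/* One bit per group of 4 channels of the 8-bit flag subregister. */
static uint8_t flag_mask(unsigned group, unsigned exec_size)
{
    assert(group % 4 == 0 && exec_size % 4 == 0 && group + exec_size <= 8);

    uint8_t mask = 0;
    for (unsigned ch = group; ch < group + exec_size; ch += 4)
        mask |= (uint8_t)(1u << (ch / 4));
    return mask;
}

/* Backward walk: a flag write is dead only if none of the nibbles it
 * writes are live afterwards, and it kills only those nibbles. */
static void dce_flags_sketch(const struct inst_sketch *insts, size_t n,
                             bool *dead)
{
    uint8_t live = 0;

    for (size_t i = n; i-- > 0;) {
        uint8_t mask = flag_mask(insts[i].group, insts[i].exec_size);

        dead[i] = false;
        if (insts[i].writes_flag) {
            dead[i] = (live & mask) == 0;
            live &= (uint8_t)~mask;
        }
        if (insts[i].reads_flag && !dead[i])
            live |= mask;
    }
}
```

With whole-subregister tracking, the second of two 4-wide writes to different halves would kill the first; with per-nibble tracking both survive when an 8-wide reader follows.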
Samuel Iglesias Gonsálvez
c1fc8fad47 i965/vec4: don't do horizontal stride on some register file types
horiz_offset() shouldn't be doing anything for scalar registers,
because all channels of any SIMD instructions will end up reading or
writing the same component of the register, so shifting the register
offset would be wrong.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
[ Francisco Jerez: Re-implement in terms of is_uniform() for
  simplicity.  Pass argument by const reference.  Clarify commit
  message. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:09 -07:00
Matt Turner
21e8e3a848 i965/vec4: Fix exec size for MOVs {SET,PICK}_{HIGH,LOW}_32BIT.
Otherwise for a pack_double_2x32_split opcode, we emit:

   vec1 64 ssa_135 = pack_double_2x32_split ssa_133, ssa_134
mov(8)          g5<1>UD         g5<4>.xUD                       { align16 1Q compacted };
mov(8)          g7<2>UD         g5<4,4,1>UD                     { align1 1Q };
        ERROR: When the destination spans two registers, the source must span two registers
               (exceptions for scalar source and packed-word to packed-dword expansion)
mov(8)          g8<2>UD         g5.4<4,4,1>UD                   { align1 2N };
        ERROR: The offset from the two source registers must be the same
mov(8)          g5<1>UD         g6<4>.xUD                       { align16 1Q compacted };
mov(8)          g7.1<2>UD       g5<4,4,1>UD                     { align1 1Q };
        ERROR: When the destination spans two registers, the source must span two registers
               (exceptions for scalar source and packed-word to packed-dword expansion)
mov(8)          g8.1<2>UD       g5.4<4,4,1>UD                   { align1 2N };
        ERROR: The offset from the two source registers must be the same

The intention was to emit mov(4)s for the instructions that have ERROR
annotations.

See tests/spec/arb_gpu_shader_fp64/execution/vs-isinf-dvec.shader_test
for example.

v2 (Samuel):
- Instead of setting the exec size to a fixed value, don't double it
(Curro).
- Add PICK_{HIGH,LOW}_32BIT to the condition.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
[ Francisco Jerez: Trivial rebase changes. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:09 -07:00
Samuel Iglesias Gonsálvez
f030aaf2fb i965/vec4: use vec4_builder to emit instructions in setup_imm_df()
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
[ Francisco Jerez: Drop useless vec4_visitor dependencies.  Demote to
  static stand-alone function.  Don't write unused components in the
  result.  Use vec4_builder interface for register allocation. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:09 -07:00
Juan A. Suarez Romero
a907c91e93 i965/vec4: consider subregister offset in live variables
Take into account offset values less than a full register (32 bytes)
when getting the var from register.

This is required when dealing with an operation that writes half of the
register (like one d2x in IVB/BYT, which uses exec_size == 4).

v2:
- Take into account this offset < 32 in liveness analysis too (Curro)

v3:
- Change formula in var_from_reg() (Curro)
- Remove useless changes (Curro)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:08 -07:00
Francisco Jerez
92649a3e67 i965/vec4: fix assert to detect SIMD lowered DF instructions in IVB
On IVB, DF instructions have their SIMD width lowered to 4, but the
exec_size will be doubled later. Fix the assert to avoid crashing in
this case.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
[ Francisco Jerez: Simplify assert.  Except for the 'inst->group % 4
  == 0' part the assertion was redundant with the previous assertion. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:08 -07:00
Samuel Iglesias Gonsálvez
6e3265eae5 i965/vec4: split VEC4_OPCODE_FROM_DOUBLE into one opcode per destination's type
This way we can set the destination type as double for all these new opcodes,
avoiding the optimizer confusion that was happening before.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
[ Francisco Jerez: Drop no_spill workaround originally needed due to
  the bogus destination type of VEC4_OPCODE_FROM_DOUBLE. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:08 -07:00
Samuel Iglesias Gonsálvez
50a5217637 i965/vec4: split d2x conversion and data gathering from one opcode to two explicit ones
When converting from a 64-bit type to a smaller data type, the destination should
be aligned to 64 bits. Because of that, we need to gather the data after the
actual conversion.

Until now, these two operations were done by VEC4_OPCODE_FROM_DOUBLE, but
now we split them explicitly into two different instructions:
VEC4_OPCODE_FROM_DOUBLE just does the conversion and
VEC4_OPCODE_PICK_LOW_32BIT gathers the data.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:08 -07:00
Juan A. Suarez Romero
cfaf14a126 i965/vec4: fix VEC4_OPCODE_FROM_DOUBLE for IVB/BYT
In the generator we must generate slightly different code for
Ivybridge/Baytrail, because of the way the stride works in
this hardware.

v2:
- Use stride and don't need to fix dst (Curro)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:08 -07:00
Juan A. Suarez Romero
be445d3ea3 i965/vec4: keep original type when dealing with null registers
Keep the original type when dealing with null registers, especially
because we do not want to introduce an implicit conversion between
types that could affect the conditional flags.

This matters especially when the original type is DF and we are working
on Ivybridge/Baytrail.

v2 (Curro)
- Fix typo.
- Use retype() instead of applying the type directly.
- Remove unneeded retype.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:08 -07:00
Samuel Iglesias Gonsálvez
a21dc2b500 i965/vec4: split DF instructions and later double its execsize in IVB/BYT
We need to split DF instructions in two on IVB/BYT as it needs an
execsize 8 to process 4 DF values (one GRF in total).

v2:
- Rename helper and make it static inline function (Matt).
- Fix indentation and add braces (Matt).

v3:
- Don't edit IR instruction when doubling exec_size (Curro)
- Add comment into the code (Curro).
- Manage ARF registers like the others (Curro)

v4:
- Add get_exec_type() function and use it to calculate the execution
  size.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
[ Francisco Jerez: Fix bogus 'type != BAD_FILE' check.  Take
  destination type as execution type where there is no valid source.
  Assert-fail if the deduced execution type is byte.  Clarify comment
  in get_lowered_simd_width().  Move SIMD width workaround outside of
  'if (...inst->size_written > REG_SIZE)' conditional block, since the
  problem should be independent of whether the amount of data written
  by the instruction is greater or lower than a GRF.  Drop redundant
  is_ivb_df definition.  Drop bogus inst->exec_size < 8 check.
  Simplify channel group assertion. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:08 -07:00
Samuel Iglesias Gonsálvez
a5399e8b1c i965/fs: lower all non-force_writemask_all DF instructions to SIMD4 on IVB/BYT
The hardware applies the same channel enable signals to both halves of
the compressed instruction which will be just wrong under non-uniform
control flow. Fix this by splitting those instructions to SIMD4.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:08 -07:00
Francisco Jerez
ebfb703d44 i965/fs: Get 64-bit indirect moves working on IVB.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-04-14 14:56:08 -07:00
Matt Turner
630b84cdc8 i965: Use source region <1,2,0> when converting to DF.
Doing so allows us to use a single MOV in VEC4_OPCODE_TO_DOUBLE instead
of two.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-04-14 14:56:08 -07:00
Juan A. Suarez Romero
3198ce3f96 i965/fs: fix lower SIMD width for IVB/BYT's MOV_INDIRECT
According to the IVB and HSW PRMs:

"2.When the destination requires two registers and the sources are
 indirect, the sources must use 1x1 regioning mode."

So for DF instructions the execution size is not limited by the number
of address registers that are available, but by the EU decompression
logic not handling VxH indirect addressing correctly.

This patch limits the SIMD width to 4 in this case.

v2:
- Fix typo (Matt).
- Fix condition (Curro)

v3:
- Add spec quote (Curro)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:07 -07:00
Juan A. Suarez Romero
571cbd05eb i965/fs: fix dst stride in IVB/BYT type conversions
When converting a DF to a 32-bit type, we set the dst stride to 2
to fulfill alignment restrictions, because the upper Dword of every
Qword will be written with an undefined value.

But on IVB/BYT this is not necessary, as each DF conversion already
writes two Dwords: the first one holds the real value and the second one a 0.
That is, IVB/BYT already sets stride = 2 implicitly, so we must set it to
1 explicitly to avoid ending up with stride = 4.

v2:
- Fix typo (Matt)

v3:
- Fix stride in the destination's brw_reg, don't modify IR (Curro)

v4:
- Remove 'is_dst' argument of brw_reg_from_fs_reg() (Curro)
- Fix comment (Curro).
- Relax hstride assert (Curro)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
[ Francisco Jerez: Minor spelling fixes. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:07 -07:00
Samuel Iglesias Gonsálvez
af6fc3a8ea i965/fs: rename lower_d2x to lower_conversions
v2:
- Change the name to lower_conversions.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:07 -07:00
Samuel Iglesias Gonsálvez
dee31311eb Revert "i965/fs: Don't emit SEL instructions for type-converting MOVs."
This reverts commit 7dccd38b40.

The d2x pass fixes SEL instructions that involve a type conversion
by doing a SEL without type conversion and then converting the result.
This pass also takes non-uniform control flow into account.

Hence 7dccd38b40 is not needed anymore.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-14 14:56:07 -07:00
Samuel Iglesias Gonsálvez
aeecc82d05 i965/fs: generalize the legalization d2x pass
Generalize it to lower any unsupported narrower conversion.

v2 (Curro):
- Add supports_type_conversion()
- Reuse existing intruction instead of cloning it.
- Generalize d2x to narrower and equal size conversions.

v3 (Curro):
- Make supports_type_conversion() const and improve it.
- Use foreach_block_and_inst to process added instructions.
- Simplify code.
- Add assert and improve comments.
- Remove redundant mov.
- Remove useless comment.
- Remove saturate == false assert and add support for saturation
  when fixing the conversion.
- Add get_exec_type() function.

v4 (Curro):
- Use get_exec_type() function to get sources' type.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:07 -07:00
Matt Turner
94ffeb7fa2 i965: Use <0,2,1> region for scalar DF sources on IVB/BYT.
On HSW+, scalar DF sources can be accessed using the normal <0,1,0>
region, but on IVB and BYT DF regions must be programmed in terms of
floats. A <0,2,1> region accomplishes this.

v2:
- Apply region <0,2,1> in brw_reg_from_fs_reg() (Curro).

v3:
- Added comment explaining the reason (Curro).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:07 -07:00
Samuel Iglesias Gonsálvez
82d17615f4 i965/fs: clamp exec_size when an instruction has a scalar DF source
Then the SIMD lowering pass will get rid of any compressed instructions with scalar
source (whether force_writemask_all or not) and we avoid hitting the Gen7 region
decompression bug.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Suggested-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:07 -07:00
Juan A. Suarez Romero
0f1316d4db i965/fs: double regioning parameters and execsize for DF in IVB/BYT
In IVB and BYT, both regioning parameters and execution sizes are measured as
32-bits element size.

So when we have something like:

mov(8) g2<1>DF g3<4,4,1>DF

We are not actually moving 8 doubles (our intention), but 4 doubles.

We need to double the parameters to cope with this issue. However,
horizontal strides don't behave as they're supposed to on IVB
for DF regions: they cause each 32-bit half of DF sources to be
strided individually, so doubling the value won't make any difference.

v2:
- Use devinfo directly (Matt).
- Use Baytrail instead of Valleview (Matt).
- Use IvyBridge instead of Ivy (Matt)
- Double the exec_size in code emission (Curro)

v3:
- Change hstride doubling by an assert and fix commit log (Curro).
- Substitute remaining compiler->devinfo by devinfo (Curro).

v4:
- Fix comment (Curro).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:07 -07:00
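The doubling described in the commit above can be sketched as follows, assuming region parameters are held as plain element counts rather than hardware encodings. The struct, the function name and the exact assert are hypothetical; only the rule "double exec size, vertical stride and width, leave the horizontal stride alone" comes from the commit message. Scalar regions are out of scope here.

```c
#include <assert.h>
#include <stddef.h>

/* Region <vstride,width,hstride> as element counts (not hw encodings). */
struct region_sketch {
    unsigned vstride;
    unsigned width;
    unsigned hstride;
};

/* Convert DF-based parameters into the 32-bit-element terms that
 * IVB/BYT expect. Only packed regions (hstride == 1) are handled in this
 * sketch, since doubling hstride would make no difference anyway. */
static void ivb_df_fixup_sketch(unsigned *exec_size, struct region_sketch *r)
{
    assert(exec_size != NULL && r != NULL);
    assert(r->hstride == 1);

    *exec_size *= 2;
    r->vstride *= 2;
    r->width *= 2;
    /* hstride intentionally left unchanged */
}
```

For example, a SIMD4 instruction with a <4,4,1>DF source would be emitted with exec size 8 and region <8,8,1>.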
Juan A. Suarez Romero
79af256388 i965/fs: add helper to retrieve instruction execution type
The execution data size is the biggest type size of any instruction
operand.

We will use it to know if the instruction deals with DF, because in Ivy
we need to double the execution size and regioning parameters.

v2:
- Fix typo in commit log (Matt)
- Use static inline function instead of fs_inst's method (Curro).
- Define the result as a constant (Curro).
- Fix indentation (Matt).
- Add braces to nested control flow (Matt).

v3 (Curro):
- Add get_exec_type() and other auxiliary functions and use them to
  calculate its size.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
[ Francisco Jerez: Fix bogus 'type != BAD_FILE' check.  Fix deduced
  execution type for integer vector types.  Take destination type as
  execution type where there is no valid source.  Assert-fail if the
  deduced execution type is byte.  Move into brw_ir_fs.h header for
  consistency with the VEC4 back-end. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:07 -07:00
Matt Turner
fd349d29e4 i965: Handle IVB DF differences in the validator.
On IVB/BYT, region parameters and execution size for DF are in terms of
32-bit elements, so they are doubled. For evaluating the validity of an
instruction, we halve them.

v2 (Sam):
- Add comments.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-04-14 14:56:07 -07:00
Iago Toral Quiroga
fbac8b1f94 i965/disasm: also print nibctrl in IVB for execsize=8
4-wide DF operations where NibCtrl applies require an execsize of 8
in IvyBridge/BayTrail.

v2:
- Refactor NibCtrl printing (Matt)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-14 14:56:06 -07:00
Boyan Ding
ff29f488d4 nir: Destination component count of shader_clock intrinsic is 2
This fixes the following error when using ARB_shader_clock on i965:
	vec1 32 ssa_0 = intrinsic shader_clock () () ()
	intrinsic store_var (ssa_0) (clock_retval) (3) /* wrmask=xy */
error: src->ssa->num_components == num_components (nir/nir_validate.c:204)

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
2017-04-14 14:54:06 -07:00
Nicolai Hähnle
39f51b5db9 radeonsi: add missing initialization for userptr buffers
Fix the accounting for memory usage of userptr buffers, which has been wrong
forever (or at least for a long time).

Also initialize flags. Without this initialization, the sparse buffer flag
might end up being set, which leads to staging buffers being used unnecessarily
(and incorrectly) in transfers to or from userptr buffers.

This works around VM faults that occur with the radeon kernel module when
running piglit ./bin/amd_pinned_memory decrement-offset map-buffer -auto

Fixes: e077c5fe65 ("gallium/radeon: transfers and invalidation for sparse buffers")
Reported-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-14 23:23:04 +02:00
Fredrik Höglund
c1dd5d0b01 radv: remove the temp descriptor set infrastructure
It is no longer used.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-04-14 23:21:24 +02:00
Fredrik Höglund
5ab5d1bee4 radv: use push descriptors in meta
Use push descriptors instead of temp descriptor sets.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-04-14 23:21:24 +02:00
Fredrik Höglund
f95caae504 radv: add private push descriptors for meta
This allows meta to use push descriptors without disturbing user
push descriptors.

radv_meta_push_descriptor_set differs from vkCmdPushDescriptorSetKHR
in that partial updates are not supported; all descriptors used in
subsequent draw commands must be pushed at the same time.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-04-14 23:21:24 +02:00
Jason Ekstrand
220974b38d anv/blorp: Properly handle VK_ATTACHMENT_UNUSED
The Vulkan driver was originally written under the assumption that
VK_ATTACHMENT_UNUSED was basically just for depth-stencil attachments.
However, the way things fell together, VK_ATTACHMENT_UNUSED can be used
anywhere in the subpass description.  The blorp-based clear and resolve
code has a bunch of places where we walk lists of attachments and we
weren't handling VK_ATTACHMENT_UNUSED everywhere.  This commit should
fix all of them.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
2017-04-14 14:20:42 -07:00
Jason Ekstrand
21d2ca72d8 anv/cmd_buffer: Use the null surface state for ATTACHMENT_UNUSED
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
2017-04-14 14:20:42 -07:00
Jason Ekstrand
02eca8b6f8 anv/cmd_buffer: Always set up a null surface state
We're about to start requiring it in yet another case and calculating
exactly when one is needed is starting to get prohibitively expensive.
A single surface state doesn't take up that much space so we may as well
create one all the time.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
2017-04-14 14:20:42 -07:00
Nicolai Hähnle
d6588d9962 radeonsi: cope with missing disassembly
For robustness and testing purposes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-14 22:51:07 +02:00
Nicolai Hähnle
d15b1f6e2d gallium/ddebug: dump missing members of pipe_draw_info
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-14 22:50:54 +02:00
Nicolai Hähnle
2ac03e90fb radeonsi: enable ARB_shader_viewport_layer_array
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-14 22:50:17 +02:00
Nicolai Hähnle
d5e53f348e radeonsi: handle ignored LAYER and VIEWPORT_INDEX writes
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-14 22:50:13 +02:00
Nicolai Hähnle
4127f38bae st/mesa: enable ARB_shader_viewport_layer_array
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-14 22:50:09 +02:00
Nicolai Hähnle
f3d2cf6c1f tgsi: clarify TGSI_SEMANTIC_{LAYER,VIEWPORT_INDEX}
Depending on pipe caps they can be writable in all vertex processing
stages, but only the output of the last stage counts.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-14 22:50:06 +02:00
Nicolai Hähnle
17f24a9b75 gallium: add PIPE_CAP_TGSI_TES_LAYER_VIEWPORT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-14 22:49:44 +02:00
Nicolai Hähnle
8b5d477aa8 configure.ac: add --enable-sanitize option
Enable code sanitizers by adding -fsanitize=$foo flags for the compiler
and linker.

In addition, this also disables checking for undefined symbols: running
the address sanitizer requires additional symbols which should be provided
by a preloaded libasan.so (preloaded for hooking into malloc & friends
globally), and the undefined symbols check gets tripped up by that.

Running the tests works normally via `make check`, but shows additional
failures with the address sanitizer due to memory leaks that seem to be
mostly leaks in the tests themselves. I believe those failures should
really be fixed. In the meantime, you can set

export ASAN_OPTIONS=detect_leaks=0

to only check for more serious error types.

v2:
- fail reasonably when an unsupported sanitize flag is given (Eric Engestrom)

Reviewed-by: Bartosz Tomczyk <bartosz.tomczyk86@gmail.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-14 22:44:30 +02:00
Jason Ekstrand
e1f6fb8021 anv/cmd_buffer: Flush the VF cache at the top of all primaries
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-04-14 13:35:02 -07:00
Jason Ekstrand
939337e49f anv/blorp: Flush the texture cache in UpdateBuffer
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-04-14 13:35:02 -07:00
Jason Ekstrand
475bab0330 anv: Limit VkDeviceMemory objects to 2GB
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-04-14 13:35:02 -07:00
Jason Ekstrand
4495b917e2 intel/blorp: Add a blorp_emit_dynamic macro
This makes it much easier to throw together a bit of dynamic state.  It
also automatically handles flushing so you don't accidentally forget.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-04-14 13:35:02 -07:00
Bruce Cherniak
1832ef6cd9 swr: Enable MSAA in OpenSWR software renderer
This patch enables multisample antialiasing in the OpenSWR software renderer.

MSAA is a proof-of-concept/work-in-progress with bug fixes and performance
on the way.  We wanted to get the changes out now to allow several customers
to begin experimenting with MSAA in a software renderer.  So as not to
impact current customers, MSAA is turned off by default - previous
functionality and performance remain intact.  It is easily enabled via
environment variables, as described below.

It has only been tested with the glx-lib winsys.  The intention is to
enable other state-trackers, both Windows and Linux and more fully support
FBOs.

There are 2 environment variables that affect behavior:

* SWR_MSAA_FORCE_ENABLE - force MSAA on, for apps that are not designed
  for MSAA... Beware, results will vary.  This is mainly for testing.

* SWR_MSAA_MAX_SAMPLE_COUNT - sets maximum supported number of
  samples (1,2,4,8,16), or 0 to disable MSAA altogether.
  (The default is currently 0.)

Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2017-04-14 15:22:45 -05:00
Bruce Cherniak
91a7f0b3af swr: Removed unnecessary PIPE_BIND flags from swr_is_format_supported
Removed unnecessary and probably wrong PIPE_BIND_SCANOUT and PIPE_BIND_SHARED
flags in favor of a check on the single PIPE_BIND_DISPLAY_TARGET flag.

Reference llvmpipe change <bee4c7718a3bd57e3d99f0913d9081cd13fe5fd>

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-04-14 15:22:44 -05:00
Bruce Cherniak
97bbb7b6a3 swr: Align swr_context allocation to SIMD alignment.
The context now contains SIMD vectors which must be aligned (specifically
samplePositions in the rastState in the derived state).  Failure to align
can result in segv crash on unaligned memory access in vector
instructions.
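
As a rough illustration of aligned allocation, here is a sketch using C11
aligned_alloc. The 32-byte alignment and the names SIMD_ALIGNMENT,
alloc_simd_aligned and is_simd_aligned are assumptions made for this example;
the driver may well use its own allocation helpers and a different alignment.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed alignment: 32 bytes, as needed for 256-bit vector loads. */
#define SIMD_ALIGNMENT 32

/* Hypothetical helper: allocates size bytes aligned to SIMD_ALIGNMENT.
 * The result is released with free(). */
static void *
alloc_simd_aligned(size_t size)
{
   /* aligned_alloc expects the size to be a multiple of the alignment. */
   size_t rounded =
      (size + SIMD_ALIGNMENT - 1) / SIMD_ALIGNMENT * SIMD_ALIGNMENT;
   return aligned_alloc(SIMD_ALIGNMENT, rounded);
}

/* Hypothetical helper: true if p satisfies the assumed SIMD alignment. */
static bool
is_simd_aligned(const void *p)
{
   return ((uintptr_t)p % SIMD_ALIGNMENT) == 0;
}
```

A plain malloc only guarantees alignment suitable for fundamental types, which
is typically less than what aligned vector instructions require.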

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-04-14 15:22:44 -05:00
Tim Rowley
4dcfa83114 swr: update gallium driver docs
v2: add back scons section, mention additional built swr libraries

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-14 15:21:31 -05:00
Grazvydas Ignotas
bffdb434b7 radv: remove irrelevant comment
A leftover from anv.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-04-14 23:16:03 +03:00
Grazvydas Ignotas
1b2fe7ce45 radv: report timestampPeriod correctly
The kernel returns the frequency in kHz, so to convert to the nanosecond
interval that Vulkan uses, the dividend should be 1000000.0 and not
100000.0.
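
The unit conversion can be checked with a small sketch. The function name
timestamp_period_ns is hypothetical and only illustrates the arithmetic:

```c
/* Hypothetical helper: converts a clock frequency in kHz into the length
 * of one tick in nanoseconds.
 *
 *   period = 1 / (f_kHz * 1e3 Hz) seconds
 *          = 1e9 / (f_kHz * 1e3) nanoseconds
 *          = 1e6 / f_kHz nanoseconds
 */
static float
timestamp_period_ns(unsigned clock_khz)
{
   return 1000000.0f / (float)clock_khz;
}
```

For example, a 1 GHz clock (1000000 kHz) gives a period of 1 ns, whereas a
dividend of 100000.0 would report 0.1 ns.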

This fixes the GPU graph in DOOM and matches the amdgpu-pro blob.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-04-14 23:15:55 +03:00
Rob Clark
9fc3e7137a nir/print: add compute shader info
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-04-14 12:46:12 -04:00
Rob Clark
16d493f1e7 gallium/docs: small correction about register files for atomics
These can operate on MEMORY[], in addition to BUFFER[] and IMAGE[]

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-14 12:46:12 -04:00
Rob Clark
0b613c20aa freedreno: enable draw/batch reordering by default
Probably should have flipped the switch a long time ago, since it
doesn't seem to cause any problems and is a nice perf boost in a number
of cases.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-14 12:46:12 -04:00
Rob Clark
b5cc88af5e freedreno/ir3: small re-order
Small re-order of switch statement to handle op-code categories in
order.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-14 12:46:12 -04:00
Rob Clark
75afd2586f freedreno/ir3: move 'keeps' to block level
For things like SSBOs and atomics we'll want to track this at a block
level.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-14 12:46:12 -04:00
Rob Clark
331bd3b5e1 freedreno/ir3: convert dynamic arrays to ralloc
Want to move one of these under ir3_block, so that gives a reason to
migrate the remaining malloc/realloc to ralloc.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-14 12:46:12 -04:00
George Kyriazis
870760e02e swr: add linux to scons build
Make swr compile for both linux and windows.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-04-14 10:59:46 -05:00
Bas Nieuwenhuizen
e20eb91e2b radv: make sizes & offsets 32 bit in radv_descriptor_update_template_entry.
v2: Also convert the calculations.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2017-04-14 14:14:07 +02:00
Kenneth Graunke
7c83d44d54 docs: Update MESA_shader_integer_functions spec to version 3.
When publishing this spec on the OpenGL ES registry, Jon Leech noticed
that it didn't actually mention what the ES dependencies and
interactions were.  I looked at extensions_table.h and noted that we
expose it in ES 3.0 contexts, and he added the obvious spec texts.

The updated copy also contains our official extension number.

https://github.com/KhronosGroup/OpenGL-Registry/issues/3

Acked-by: Matt Turner <mattst88@gmail.com>
2017-04-13 23:01:27 -07:00
Bas Nieuwenhuizen
17a75b4da4 radv: Set descriptor set limits.
Properly and with comments this time.

Signed-off-by: Bas Nieuwenhuizen <bansi@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-13 22:55:11 +02:00
Bas Nieuwenhuizen
24ccf1a8b6 radv: Increase integer sizes in descriptor sets.
Needed if we want to allow them taking more than 64 KiB. The calculations
of these already used 32 bits.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-13 22:55:11 +02:00
Dave Airlie
58dd57cb94 radv: support S8_UINT as a depth/stencil format.
This enables a bunch of NotSupported CTS tests.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-14 05:49:25 +10:00
Dave Airlie
16b2dc0ca1 radv: bump maxGeometryShaderInvocations.
This bumps it to the same level as amdgpu-pro; it also
moves a bunch of dEQP-VK.geometry.instanced.* from
NotSupported to Pass.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-14 05:49:14 +10:00
Axel Davy
442780ea37 st/nine: Fix support for ps 1.4 dw and dz modifiers
RCP was used incorrectly to support NINED3DSPSM_DW and
NINED3DSPSM_DZ. src.x was used as input instead of src.w
or src.z.

Fixes: https://github.com/iXit/Mesa-3D/issues/271

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-04-13 20:05:03 +02:00
Jan Vesely
d8ffe4d0ce clover: Add missing include to compat header
Fixes build failure with LLVM 4

Fixes: a981e68c26
	(clover: Fix build against clang SVN >= r299965)

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-13 13:34:21 -04:00
Nicolai Hähnle
b52721e3b6 gallium/radeon: never use staging buffers with AMD_pinned_memory
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-04-13 17:36:26 +02:00
Nicolai Hähnle
4f7e3fbb50 radeonsi: fix gl_BaseVertex in non-indexed draws
gl_BaseVertex is supposed to be 0 in non-indexed draws. Unfortunately, the
way they're implemented, the VGT always generates indices starting at 0,
and the VS prolog adds the start index.

There's a VGT_INDX_OFFSET register which causes the VGT to start at a
driver-defined index. However, this register cannot be written from
indirect draws.

So fix this unlikely case by setting a bit to tell the VS whether the
draw is indexed or not, so that gl_BaseVertex can be adjusted accordingly
when used.
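
The approach can be sketched roughly as below. The bit position and the names
VS_STATE_INDEXED_BIT, pack_vs_state and shader_base_vertex are hypothetical;
the real driver packs this bit together with other VS state and does the
selection in the generated shader code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position inside the VS state word. */
#define VS_STATE_INDEXED_BIT (1u << 0)

/* Driver side: record whether the current draw is indexed. */
static uint32_t
pack_vs_state(bool indexed)
{
   return indexed ? VS_STATE_INDEXED_BIT : 0;
}

/* Shader side: the start/base value is always added to the generated
 * vertex index, but gl_BaseVertex only reports it for indexed draws. */
static int32_t
shader_base_vertex(uint32_t vs_state, int32_t start_or_base)
{
   return (vs_state & VS_STATE_INDEXED_BIT) ? start_or_base : 0;
}
```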

Fixes a bug in
KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters.*

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-13 17:31:11 +02:00
Nicolai Hähnle
472c84d1ad radeonsi: provide VS_STATE input to all VS variants
v2: fix incorrect change in get_tcs_out_patch_stride

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-13 17:30:20 +02:00
Nicolai Hähnle
3b9fbcb3b6 radeonsi: change the bit-packing of LS out/TCS in data
Avoid conflicts when merging various VS state bits.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-13 17:30:19 +02:00
Nicolai Hähnle
ff39f0d59c radeonsi: emit VS_STATE register explicitly from si_draw_vbo
We will merge other derived state information into this register.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-13 17:30:18 +02:00
Nicolai Hähnle
8c224d3d9f radeonsi: extract derived tess state emit to higher level
Especially with subsequent changes, this makes it easier to see the
sequence of state emits at the higher level.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-13 17:30:17 +02:00
Nicolai Hähnle
215ceb37b9 radeonsi: drop support for TGSI_SEMANTIC_VERTEXID_NOBASE
It is unused.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-13 17:30:11 +02:00
Bas Nieuwenhuizen
4f7fb25d4e radv: Add more trace points.
Most trace points happen after an operation, so add a trace point
at the start of the command buffer.

Furthermore, add one after a CmdUpdateBuffer using CP_DMA as that
didn't emit one yet.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-13 16:06:47 +02:00
Bas Nieuwenhuizen
8a535a8bc0 radv: Ignore CmdUpdateBuffer with size 0.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-13 16:06:34 +02:00
Bas Nieuwenhuizen
04c7452d0c radv: Enable query inheritance.
timestamp and pipeline_statistics only do something on begin & end,
so they don't need any action.

Occlusion queries only do something to enable/disable and that
register is set nowhere else so that doesn't need extra support either.
(We technically should fix it to update the reg with the number of
 samples, but that hasn't happened yet, so we only change it to
 enable/disable counting)

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-13 16:04:27 +02:00
Bas Nieuwenhuizen
c3f38c8968 radv: enable variableMultisampleRate.
This is only relevant with 0 attachments. In that case we do nothing
on subpass switch already, and the pipeline is the authoritative
source of the number of samples, so this shouldn't change anything.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-13 15:48:14 +02:00
Edmondo Tommasina
5589fd89e1 gallium/hud: set the dump file streams to line buffered
Flush the HUD value streams to the dump files after every newline.

v2: check that fopen succeeded  (Julien)
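
A minimal sketch of the technique, using the standard setvbuf call. The
function name open_dump_file is hypothetical and not taken from the HUD code:

```c
#include <stdio.h>

/* Hypothetical helper: opens a dump file for writing and switches its
 * stream to line buffering. Returns NULL if the file cannot be opened. */
static FILE *
open_dump_file(const char *path)
{
   FILE *f = fopen(path, "w");

   /* Only touch the stream if fopen succeeded. */
   if (!f)
      return NULL;

   /* _IOLBF: flush the buffer whenever a newline is written. */
   setvbuf(f, NULL, _IOLBF, 0);
   return f;
}
```

With line buffering, each completed line reaches the file right away, so the
dump can be followed while the application is still running.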

Reviewed-and-Tested-by: Julien Isorce <jisorce@oblong.com>
2017-04-13 12:38:49 +01:00
Dave Airlie
01d0c5a922 radv: fix stencil regression since new addrlib import
The addrlib import meant we'd return after we attempted
to set up the no stencil bits for an S8_UINT, now we break
and use the stencil level info when creating stencil DB
info.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-13 20:32:03 +10:00
Dave Airlie
4bcebe10ca radv: allocate thin textures as linear.
This is ported from radeonsi, and avoids the bug in the
addrlib code. This should probably be something addrlib
does for us, but for now this fixes the regression without
changing addrlib and aligns us with radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-13 20:31:38 +10:00
Samuel Pitoiset
9ced105a52 i965: add missing ir_unop_*/ir_binop_* in visit_leave()
Fixes the following Clang warnings.

brw_fs_channel_expressions.cpp:219:12: warning: enumeration values 'ir_unop_ballot', 'ir_unop_read_first_invocation', and 'ir_binop_read_invocation' not handled in switch [-Wswitch]
   switch (expr->operation) {
           ^
1 warning generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-13 10:06:07 +02:00
Samuel Pitoiset
b6b566b48e st/mesa: fix wrong comparison in update_framebuffer_state()
state_tracker/st_atom_framebuffer.c:208:27: warning: comparison of constant 4294967295 with expression of type 'uint16_t' (aka 'unsigned short') is always false [-Wtautological-constant-out-of-range-compare]
   if (framebuffer->width == UINT_MAX)
       ~~~~~~~~~~~~~~~~~~ ^  ~~~~~~~~
state_tracker/st_atom_framebuffer.c:210:28: warning: comparison of constant 4294967295 with expression of type 'uint16_t' (aka 'unsigned short') is always false [-Wtautological-constant-out-of-range-compare]
   if (framebuffer->height == UINT_MAX)
       ~~~~~~~~~~~~~~~~~~~ ^  ~~~~~~~~
2 warnings generated.
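
Why the comparison is always false, and one way to write it correctly, can be
sketched as follows. The function name is_max_u16 is hypothetical, and
comparing against UINT16_MAX is an assumption about a reasonable fix rather
than a quote of the actual change:

```c
#include <stdbool.h>
#include <stdint.h>

/* A uint16_t holds at most 65535. In "v == UINT_MAX" it is converted to
 * unsigned int, so with a 32-bit int it can never equal 4294967295 and
 * the test is always false. Comparing against the maximum of the
 * variable's own type gives the intended check. */
static bool
is_max_u16(uint16_t v)
{
   return v == UINT16_MAX;
}
```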

Fixes: eb0fd0e5f8 ("gallium: decrease the size of pipe_framebuffer_state - 96 -> 80 bytes")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-13 10:06:06 +02:00
Samuel Pitoiset
a18bd1373b radeon: fix duplicate 'const' specifier
Fixes the following Clang warning.

In file included from radeon_debug.c:32:
./radeon_common_context.h:500:19: warning: duplicate 'const' declaration specifier [-Wduplicate-decl-specifier]
extern const char const *radeonVendorString;

v2: - do not remove the duplicate 'const' qualifier, fix it
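
For reference, this is how the two qualifiers are usually placed when both the
characters and the pointer itself are meant to be read-only. The names
example_vendor_string and example_vendor_length are made up for this sketch,
and the declaration is only presumed to match the intent of the fix:

```c
#include <stddef.h>
#include <string.h>

/* "const char const *" repeats the qualifier on the pointed-to type.
 * Placing the second const after the '*' qualifies the pointer instead,
 * so neither the characters nor the pointer can be modified. */
static const char *const example_vendor_string = "Example Vendor";

/* Small accessor so the declaration is exercised. */
static size_t
example_vendor_length(void)
{
   return strlen(example_vendor_string);
}
```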

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-04-13 10:06:06 +02:00
Samuel Pitoiset
ede273458c svga: remove unused vmw_dri1_intersect_src_bbox()
Fixes the following Clang warning.

vmw_screen_dri.c:130:1: warning: unused function 'vmw_dri1_intersect_src_bbox' [-Wunused-function]
vmw_dri1_intersect_src_bbox(struct drm_clip_rect *dst,
^
1 warning generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-13 10:06:05 +02:00
Samuel Pitoiset
fbe2ff7740 llvmpipe: remove unused subpixel_snap() and fixed_to_float()
Fixes the following Clang warnings.

lp_setup_tri.c:55:1: warning: unused function 'subpixel_snap' [-Wunused-function]
subpixel_snap(float a)
^
lp_setup_tri.c:61:1: warning: unused function 'fixed_to_float' [-Wunused-function]
fixed_to_float(int a)
^

v2: - do not remove subpixel_snap() (use !PIPE_ARCH_SSE instead)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-04-13 10:06:05 +02:00
Samuel Pitoiset
12647533fa softpipe: remove unused sp_exec_fragment_shader()
Fixes the following Clang warning.

sp_fs_exec.c:56:1: warning: unused function 'sp_exec_fragment_shader' [-Wunused-function]
sp_exec_fragment_shader(const struct sp_fragment_shader_variant *var)
^
1 warning generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-13 10:06:04 +02:00
Samuel Pitoiset
5fbe99ce9f softpipe: remove unused quad_shade_stage()
Fixes the following Clang warning.

sp_quad_fs.c:60:1: warning: unused function 'quad_shade_stage' [-Wunused-function]
quad_shade_stage(struct quad_stage *qs)
^
1 warning generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-13 10:06:04 +02:00
Samuel Pitoiset
b885488c22 softpipe: remove unused get_texel_quad_2d()
Fixes the following Clang warning.

sp_tex_sample.c:802:1: warning: unused function 'get_texel_quad_2d' [-Wunused-function]
get_texel_quad_2d(const struct sp_sampler_view *sp_sview,
^
  CC       sp_tile_cache.lo
1 warning generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-13 10:06:04 +02:00
Samuel Pitoiset
81ba57f463 trace: remove some unused trace_dump_tag*() functions
Fixes the following Clang warnings.

tr_dump.c:137:1: warning: unused function 'trace_dump_tag' [-Wunused-function]
trace_dump_tag(const char *name)
^
tr_dump.c:168:1: warning: unused function 'trace_dump_tag_begin2' [-Wunused-function]
trace_dump_tag_begin2(const char *name,
^
tr_dump.c:187:1: warning: unused function 'trace_dump_tag_begin3' [-Wunused-function]
trace_dump_tag_begin3(const char *name,
^
  CC       tr_texture.lo
3 warnings generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-13 10:06:04 +02:00
Samuel Pitoiset
c53a120a46 draw: remove unused wideline_stage()
Fixes the following Clang warning.

draw/draw_pipe_wide_line.c:48:38: warning: unused function 'wideline_stage' [-Wunused-function]
static inline struct wideline_stage *wideline_stage( struct draw_stage *stage )
                                     ^
1 warning generated.

v2: - remove commented code (Roland Scheidegger)
v3: - remove half_line_width in the struct

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-04-13 10:05:59 +02:00
Samuel Pitoiset
4dfe38aa9c draw: remove unused overflow()
Fixes the following Clang warning.

draw/draw_pipe_vbuf.c:102:1: warning: unused function 'overflow' [-Wunused-function]
overflow( void *map, void *ptr, unsigned bytes, unsigned bufsz )
^
1 warning generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-13 09:58:52 +02:00
Samuel Pitoiset
18844005ec mesa: remove some unused functions in the perf monitor area
Fixes the following Clang warnings.

main/performance_monitor.c:157:1: warning: unused function 'index_to_queryid' [-Wunused-function]
index_to_queryid(GLuint index)
^
main/performance_monitor.c:163:1: warning: unused function 'queryid_valid' [-Wunused-function]
queryid_valid(const struct gl_context *ctx, GLuint queryid)
^
main/performance_monitor.c:169:1: warning: unused function 'counterid_to_index' [-Wunused-function]
counterid_to_index(GLuint counterid)
^
3 warnings generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-13 09:58:24 +02:00
Samuel Pitoiset
df2dba558c mesa: remove unused clamp_float_to_uint() and clamp_half_to_uint()
Fixes the following Clang warnings.

main/pack.c:470:1: warning: unused function 'clamp_float_to_uint' [-Wunused-function]
clamp_float_to_uint(GLfloat f)
^
main/pack.c:477:1: warning: unused function 'clamp_half_to_uint' [-Wunused-function]
clamp_half_to_uint(GLhalfARB h)
^
2 warnings generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-13 09:58:24 +02:00
Samuel Pitoiset
bdb53e240b mesa: remove unused _mesa_unmarshal_BindBufferBase()
Fixes the following Clang warning.

main/marshal.c:209:1: warning: unused function '_mesa_unmarshal_BindBufferBase' [-Wunused-function]
_mesa_unmarshal_BindBufferBase(struct gl_context *ctx, const struct marshal_cmd_BindBufferBase *cmd)
^
1 warning generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-13 09:58:19 +02:00
Samuel Pitoiset
b3375800d7 virgl: add missing PIPE_CAP_DOUBLES
Fixes the following Clang warning.

virgl_screen.c:60:12: warning: enumeration value 'PIPE_CAP_DOUBLES' not handled in switch [-Wswitch]
   switch (param) {
           ^
1 warning generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-13 09:58:05 +02:00
Samuel Pitoiset
d5cd4990cd glsl: simplify apply_image_qualifier_to_variable()
This removes one level of indentation and will improve readability
for bindless images.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-13 09:52:55 +02:00
Samuel Pitoiset
6bb0f75bb6 glsl: add validate_fragment_flat_interpolation_input()
Requested by Timothy Arceri.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-13 09:52:48 +02:00
Boyan Ding
d02829c94e nvc0: Enable ARB_shader_ballot on Kepler+
readInvocationARB() and readFirstInvocationARB() need the SHFL.IDX
instruction, which was introduced in Kepler.

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-13 02:25:17 -04:00
Boyan Ding
59f6aa8096 nvc0/ir: Implement TGSI_OPCODE_BALLOT and TGSI_OPCODE_READ_*
v2: Check if each channel is masked in TGSI_OPCODE_BALLOT (Ilia Mirkin)

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-13 02:25:14 -04:00
Boyan Ding
48d00779d0 nvc0/ir: Implement TGSI_SEMANTIC_SUBGROUP_*
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-13 02:25:08 -04:00
Boyan Ding
f7787f224f nvc0/ir: Add SV_LANEMASK_* system values.
v2: Add name strings in nv50_ir_print.cpp (Ilia Mirkin)

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-13 02:25:04 -04:00
Boyan Ding
2a3c4c6bc3 nvc0/ir: Allow 0/1 immediate value as source of OP_VOTE
Implementation of readFirstInvocationARB() on nvidia hardware needs a
ballotARB(true) used to decide the first active thread. This is expressed
in gm107 asm as (supposing output is $r0):
	vote any $r0 0x1 0x1

To model the always true input, which corresponds to the second 0x1
above, we make OP_VOTE accept immediate value 0/1 and emit "0x1" and
"not 0x1" in the src field respectively.

v2: Make sure that asImm() is not NULL (Samuel Pitoiset)

v3: (Ilia Mirkin)
Make the handling more symmetric with predicate version in gm107
Use i->getSrc(s)

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-13 02:24:59 -04:00
Boyan Ding
f1252996f5 gk110/ir: Emit OP_SHFL
v2: Make sure that asImm() is not NULL (Samuel Pitoiset)

v3: Check the range of immediate in OP_SHFL (Ilia Mirkin)

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-13 02:24:55 -04:00
Boyan Ding
c32e150008 nvc0/ir: Emit OP_SHFL
v2: (Samuel Pitoiset)
Add an assertion to check if the target is Kepler
Make sure that asImm() is not NULL

v3: (Ilia Mirkin)
Check the range of immediate value of OP_SHFL
Use the new setPDSTL API

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-13 02:24:52 -04:00
Boyan Ding
d941ef3829 nvc0/ir: Properly handle a "split form" of predicate destination
GF100's ISA encoding has a weird form of predicate destination where its
3 bits are split across the whole instruction. Use a dedicated setPDSTL
function instead of original defId which is incorrect in this case.

v2: (Ilia Mirkin)
Change API of setPDSTL() to handle cases of no output
Fix setting of the highest bit in setPDSTL()

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-13 02:24:47 -04:00
Boyan Ding
854554c314 gm107/ir: Emit third src 'bound' and optional predicate output of SHFL
v2: Emit the original hard-coded 0x1c03 when OP_SHFL is used in gm107's
    lowering (Samuel Pitoiset)

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-13 02:24:30 -04:00
Michel Dänzer
a981e68c26 clover: Fix build against clang SVN >= r299965
clang::LangAS::Offset is gone, the behaviour is as if it was 0.

v2: Introduce and use clover::llvm::compat::lang_as_offset (Francisco
    Jerez)

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-04-13 12:51:24 +09:00
Brian Paul
46f49d6fdc st/mesa: add some _mesa_is_winsys_fbo() assertions
A few functions related to FBOs/renderbuffers should only be used with
window-system buffers, not user-created FBOs.  Assert for that.
Add additional comments.  No piglit regressions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-12 21:13:23 -06:00
Brian Paul
c36d224921 st/mesa: minor optimization in st_DrawBuffers()
We only do on-demand renderbuffer allocation for window-system FBOs,
not user-created FBOs.  So put the loop inside a conditional.

Plus, add some comments.  No piglit regressions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-12 21:13:23 -06:00
Timothy Arceri
fbcd709a34 mesa/st: only update samplers for stages that have changed
Might help reduce cpu for some apps that use sso.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-13 12:08:31 +10:00
Vinson Lee
f30f575e7b st/mesa: Fix missing-braces warning.
CXX      state_tracker/st_glsl_to_nir.lo
state_tracker/st_glsl_to_nir.cpp:250:57: warning: suggest braces around initialization of subobject [-Wmissing-braces]
      nir_lower_wpos_ytransform_options wpos_options = {0};
                                                        ^
                                                        {}

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-12 15:43:30 -07:00
Alex Smith
4603bea1aa radv: Disable primitive restart for non-indexed draws
According to the Vulkan spec, VkPipelineInputAssemblyStateCreateInfo's
primitiveRestartEnable flag should only apply to indexed draws, however
it was being enabled regardless of the type of draw. This could cause
problems for non-indexed draws with >=65535 vertices if the previous
indexed draw used 16-bit indices.
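
As a rough illustration of the rule described above (not radv's actual code), the effective primitive restart state can be derived from both the pipeline flag and the kind of draw. The struct and function names here are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified draw description; not radv's real structures. */
struct draw_info {
    bool indexed;          /* true for indexed draws */
    uint32_t vertex_count;
};

/* Primitive restart only takes effect when the pipeline enables it AND
 * the draw is indexed. A non-indexed draw never restarts, no matter what
 * index size the previous indexed draw used. */
static bool
effective_primitive_restart(bool pipeline_restart_enable,
                            const struct draw_info *draw)
{
    assert(draw != NULL);
    return pipeline_restart_enable && draw->indexed;
}
```

With this shape, a non-indexed draw of 70000 vertices gets restart disabled even if the pipeline was created with primitiveRestartEnable set.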

Fixes corruption of the credits text in Mad Max.

v2: Reset primitive restart state after executing a secondary command
    buffer.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-04-12 20:58:41 +02:00
Matt Turner
ab18578b03 anv: Only define wsi_cbs when VK_USE_PLATFORM_WAYLAND_KHR defined 2017-04-12 11:00:39 -07:00
Marek Olšák
f7b1371d2d Revert "r600g: get rid of dummy pixel shader"
This reverts commit 61e47d92c5.

It causes a hang on RS780.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100663
2017-04-12 17:46:21 +02:00
Bartosz Tomczyk
bb847e78cf mesa: fix memory leak in arb_fragment_program
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-12 17:50:36 +10:00
Bas Nieuwenhuizen
c4d43388c0 radv: Hash the immutable samplers.
Since the shader code can include them.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-12 07:43:38 +02:00
Bas Nieuwenhuizen
bd91caf863 radv: Use an offset instead of pointers for immutable samplers.
Makes more sense when we hash the layout for the pipeline cache.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-12 07:43:25 +02:00
Bas Nieuwenhuizen
b35b5951fc radv: Stop shadowing the result in radv_GetQueryPoolResults.
The outer result was referred to, which meant bugs.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-12 07:38:58 +02:00
Bas Nieuwenhuizen
0763453291 radv: Return VK_NOT_READY if the query results are not available.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: 8475a14302 ("radv: Implement pipeline statistics queries.")
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2017-04-12 07:38:58 +02:00
Bas Nieuwenhuizen
2dacb727c2 radv: Set query availability bit even if we don't wait.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: 8475a14302 ("radv: Implement pipeline statistics queries.")
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2017-04-12 07:38:58 +02:00
Gregory Hainaut
03d1de387e mesa: avoid NULL ptr in prog parameter name
Context: _mesa_add_parameter is sometimes[0] called with a
NULL name as a means of denoting an unnamed parameter.

Allowing a NULL pointer as a name means that it must be NULL-checked
on each access. So far that isn't always[1] the case.

The parameter name is only used for debugging purposes (printf) and
to look up the index/location of the program by the application.

Conclusion: there is no valid reason to use a NULL pointer instead of
an empty string. So it was decided to use an empty string, which avoids all
issues related to NULL pointers.
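
A minimal sketch of the idea, assuming a simplified parameter struct (the names param, param_init and param_fini are hypothetical and are not Mesa's API): the NULL name is normalized to an empty string once, at creation time, so later readers never need a NULL check.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a program parameter; not Mesa's real struct. */
struct param {
    char *name;   /* never NULL after a successful param_init() */
};

/* Store a private copy of the name, mapping NULL to "".
 * Returns 0 on success and -1 if the allocation fails. */
static int
param_init(struct param *p, const char *name)
{
    assert(p != NULL);

    const char *src = name ? name : "";
    size_t len = strlen(src) + 1;

    p->name = malloc(len);
    if (!p->name)
        return -1;
    memcpy(p->name, src, len);
    return 0;
}

static void
param_fini(struct param *p)
{
    free(p->name);
    p->name = NULL;
}
```

After param_init(&p, NULL), calls such as strcmp(p.name, ...) or printf("%s", p.name) are safe without any extra guard.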

[0]: texture gather offsets glsl opcode and st_init_atifs_prog
[1]: at least shader cache, st_nir_lookup_parameter_index and some printfs

Issue found by piglit 'texturegatheroffsets' tests on Nouveau

v4: new patch based on Nicolai/Timothy/ilia discussion
Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-12 14:30:28 +10:00
Kenneth Graunke
754b961f38 i965/drm: Use bools for a few flags.
These one bit values are booleans.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-11 21:07:45 -07:00
Kenneth Graunke
44ecbbebe2 i965/drm: Make brw_bo_alloc_tiled flags parameter 32-bit.
unsigned long is a terrible type for a bitfield - if you need fewer
than 32 bits, it wastes 4 bytes.  If you need more, things break on
32-bit builds.  Just use unsigned.

Even that's a bit ridiculous as we only have one flag today.
Still, it's at least somewhat better.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-11 21:07:45 -07:00
Kenneth Graunke
f374b9449e i965/drm: Make BO size a uint64_t rather than unsigned long.
The drm_i915_gem_create ioctl structure uses a __u64 for the size,
so we should probably use uint64_t to match.  In theory, we could
probably have a BO larger than 4GB, using a 48-bit PPGTT - it just
wouldn't be mappable in the CPU's 32-bit address space.
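
To make the truncation risk concrete, here is a small sketch. It assumes uint32_t as a stand-in for unsigned long on a 32-bit build; the helper names are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Size of a hypothetical 5 GiB buffer object. */
#define FIVE_GIB (UINT64_C(5) << 30)

/* Storing the size in a 32-bit field keeps only the low 32 bits. */
static uint32_t
size_as_u32(uint64_t size)
{
    return (uint32_t)size;
}

/* A 64-bit field, matching the kernel's __u64, round-trips the value. */
static uint64_t
size_as_u64(uint64_t size)
{
    return size;
}
```

Here size_as_u32(FIVE_GIB) yields 1 GiB, because 5 GiB is 0x140000000 and only 0x40000000 survives, while size_as_u64(FIVE_GIB) preserves the full value.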

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-11 21:07:45 -07:00
Kenneth Graunke
c85d6832fd i965/drm: Make alignment parameter a uint64_t.
Theoretically, with a 48-bit address space, we could have buffers
with an alignment of >= 4GB.  It's a bit silly, but the exec_object
structs (drm_i915_gem_exec_object2) use a __u64 for this, so we may
as well use the same type as the kernel API.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-11 21:07:45 -07:00
Kenneth Graunke
444ab8126d i965/drm: Make stride/pitch a uint32_t.
struct drm_i915_gem_set_tiling's stride field is a __u32.
intel_mipmap_tree::stride is a uint32_t.  Using unsigned long just
doesn't make sense.  Switching also lets us drop many pointless
locals that only existed to deal with the type mismatch.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-11 21:07:45 -07:00
Kenneth Graunke
14fc188460 i965/drm: Fix types for pwrite/pread fields.
The ioctl structs contain __u64 offset and size fields, so make them
uint64_t rather than unsigned long.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-11 21:07:45 -07:00
Kenneth Graunke
193601311c i965/drm: Make brw_bo_alloc_tiled take tiling by value, not pointer.
For some reason we passed tiling by pointer, through several layers,
even though the functions only read the initial value, and never
actually change it.  We even had a do-while loop that executed until
the tiling mode matched - except it always did, so it only ran once.
We then had bogus error handling in case it changed the tiling mode
to something nonsensical...which it never did.

Drop all this nonsense.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-11 21:07:45 -07:00
Timothy Arceri
9bd7184078 mesa/st: remove _mesa_get_fallback_texture() calls
These calls look like leftovers from fallback texture support first
being added to the st in 8f6d9e12be and then later being added
to core mesa in 00e203fe17.

The piglit test fp-incomplete-tex continues to work with this
change.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-12 12:00:35 +10:00
Timothy Arceri
c72170fb1f mesa: use pre_hashed version of search for the mesa hash table
The key is just an unsigned int so there is never any real hashing
done.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-04-12 12:00:35 +10:00
Tim Rowley
d0f381f865 swr: [rasterizer core] Disable 8x2 tile backend
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
31a23a9d9d swr: [rasterizer common] Add _simd_testz_si alias
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
7abd1f9b24 swr: [rasterizer archrast] Fix archrast for MSVC 2017 compiler
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
54d11b3c95 swr: [rasterizer jitter] Remove unused function
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
af909c0200 swr: [rasterizer jitter] Remove HAVE_LLVM tests supporting llvm < 3.8
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
973d38801d swr: [rasterizer common/core] Fix 32-bit windows build
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
217b791a44 swr: [rasterizer core] Fix unused variable warnings
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
da7aa39f93 swr: [rasterizer core] Code formatting change
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
c8cc07ca25 swr: [rasterizer core] SIMD16 Frontend WIP - PA
Fix PA NextPrim for SIMD8 on SIMD16.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
08a7136848 swr: [rasterizer core] SIMD16 Frontend WIP - Clipper
Implement widened clipper for SIMD16.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
0033e86b2c swr: [rasterizer core] Multisample sample position setup change
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Tim Rowley
4c093869db swr: [rasterizer core] Reduce templates to speed compile
Quick patch to remove some unused template params to cut down
rasterizer compile time.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-11 18:01:03 -05:00
Francisco Jerez
147e71242c i965/fs: Take into account lower frequency of conditional blocks in spilling cost heuristic.
The individual branches of an if/else/endif construct will be executed
some unknown number of times between 0 and 1 relative to the parent
block.  Use some factor in between as weight while approximating the
cost of spill/fill instructions within a conditional if-else branch.
This favors spilling registers used within conditional branches which
are likely to be executed less frequently than registers used at the
top level.
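
The weighting can be sketched roughly as follows. This is a simplified model, not the i965 backend code: the instruction representation is hypothetical, and both the loop factor of 10 and the conditional factor of 0.5 are assumed values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified control-flow markers. */
enum op { OP_OTHER, OP_LOOP_BEGIN, OP_LOOP_END, OP_IF, OP_ENDIF };

struct inst {
    enum op op;
    int reg;   /* register used by this instruction, or -1 for none */
};

#define LOOP_WEIGHT 10.0f  /* assumed: loop bodies run more often */
#define COND_WEIGHT 0.5f   /* assumed: a branch runs between 0 and 1 times */

/* Accumulate a per-register spill cost, scaling each use by how often
 * its enclosing block is expected to execute. */
static void
estimate_spill_costs(const struct inst *insts, size_t n,
                     float *cost, size_t num_regs)
{
    float scale = 1.0f;

    assert(insts != NULL && cost != NULL);

    for (size_t r = 0; r < num_regs; r++)
        cost[r] = 0.0f;

    for (size_t i = 0; i < n; i++) {
        const struct inst *inst = &insts[i];

        if (inst->reg >= 0 && (size_t)inst->reg < num_regs)
            cost[inst->reg] += scale;

        switch (inst->op) {
        case OP_LOOP_BEGIN: scale *= LOOP_WEIGHT; break;
        case OP_LOOP_END:   scale /= LOOP_WEIGHT; break;
        case OP_IF:         scale *= COND_WEIGHT; break;
        case OP_ENDIF:      scale /= COND_WEIGHT; break;
        default:            break;
        }
    }
}
```

A register used once inside an if block ends up with cost 0.5, versus 1.0 for a single top-level use, so the allocator would prefer to spill the former.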

Improves the framerate of the SynMark2 OglCSDof benchmark by ~1.9x on
my SKL GT4e.  Should have a comparable effect on other platforms.  No
significant regressions.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-11 15:28:54 -07:00
Tim Rowley
9a7b257450 swr: return true for PIPE_CAP_DOUBLES
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-11 13:16:43 -05:00
Kenneth Graunke
02ccd8f52c i965: Set kernel features before computing max GL version.
We check these bitfields when computing the Haswell max GL version.
We need to set them ahead of time, or they won't exist, and all our
checks will fail.  That sets the max core profile GL version to 4.2.

This introduces the bizarre situation where asking for a GL context
with version 4.3+ fails, but asking for a GL core profile context
with version <= 4.2 actually promotes you to a 4.5 context.

GLX_MESA_query_renderer also reported the bogus 4.2 value.
Now it shows 4.5.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reported-and-tested-by: Rafael Ristovski <rafael.ristovski@gmail.com>
2017-04-11 08:58:16 -07:00
Juan A. Suarez Romero
8d7a82ae32 anv: remove needless VALGRIND_MAKE_MEM_DEFINED
This is already invoked in the following VG_NOACCESS_READ() call.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-11 17:21:57 +02:00
Lucas Stach
4ee7c2c284 etnaviv: enable TS, but disable autodisable
Autodisable seems to cause missed rendering in some cases, but
otherwise TS seems to work properly.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-04-11 16:52:31 +02:00
Lucas Stach
797890bbbd etnaviv: enable TS also on sampler resources
Fixes a performance issue with imported winsys buffers as those are
marked with binding sampler view.

This might require a TS flush on single pipe chips that directly
sample from the rendered buffer, but otherwise seems to work fine.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-04-11 16:52:27 +02:00
Lucas Stach
52f6c8cc31 etnaviv: align TS surface size to number of pixel pipes
The TS surface gets cleared by a tiled RS fill. If the chip has
more than 1 pixel pipe the size of the TS surface needs to be
aligned so that each pipe address matches a tile start, otherwise
the RS will hang.
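
A sketch of the alignment described above, under the assumption that the requirement can be expressed as rounding the size up to a multiple of (tile size x pipe count). The tile_bytes parameter and both function names are hypothetical; the real granularity is hardware-specific and not stated here.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment (alignment > 0). */
static uint32_t
align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0);
    return (value + alignment - 1) / alignment * alignment;
}

/* Pad the TS surface so that, when it is divided evenly between the
 * pixel pipes, every pipe's share starts on a tile boundary. */
static uint32_t
ts_surface_size(uint32_t raw_size, uint32_t tile_bytes, uint32_t pixel_pipes)
{
    return align_up(raw_size, tile_bytes * pixel_pipes);
}
```

For example, with an assumed 64-byte tile and 2 pipes, a raw size of 1000 bytes is padded to 1024, giving each pipe 512 bytes, which is a whole number of tiles.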

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-04-11 16:52:22 +02:00
Lucas Stach
37622ecc79 etnaviv: avoid using invalid TS
The TS is only valid after it has been initialized by a fast
clear, so it should not be taken into account when blitting
resources that haven't been cleared. Also the blit itself
invalidates the destination TS, as it's not updated and will
retain data from the previous rendering after the blit.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-04-11 16:52:01 +02:00
Samuel Pitoiset
768f81b62b glsl: use the BA1 macro for textureQueryLevels()
For both consistency and new bindless sampler types.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-11 10:24:57 +02:00
Samuel Pitoiset
981ba1c89b glsl: use the BA1 macro for textureSamples()
For both consistency and new bindless sampler types.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-11 10:24:54 +02:00
Samuel Pitoiset
29082b0b22 glsl: use the BA1 macro for textureCubeArrayShadow()
For both consistency and new bindless sampler types.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-11 10:24:51 +02:00
Bas Nieuwenhuizen
8475a14302 radv: Implement pipeline statistics queries.
The devil is in the shader again, otherwise this is
fairly straightforward.

The CTS contains no pipeline statistics copy to buffer
testcases, so I did a basic smoketest.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen
d2906bc72d radv: Let count be dynamic in radv_break_on_count.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen
8473193760 radv: Rename query pipeline/set layout.
For using them with both occlusion and pipeline statistics queries.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen
95743d5b88 radv: Use VK_WHOLE_SIZE for the query buffer bindings.
The buffer sizes are specified just a few lines earlier, so don't
repeat ourselves.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen
8911dd6d12 radv: Use a shader for occlusion CmdCopyQueryPoolResults.
Use the new occlusion query copy shader.

We don't use the shader for the waiting as a polling loop interacts badly
with having caching enabled. I noticed on my GPU (Tonga) that the values
are written out in order, so I just use a WAIT_REG_MEM on the last value.

If it turns out other chips don't do that we may need to look a bit more
into this. Having 8 WAIT_REG_MEM packets per query doesn't sound ideal.

This also restricts the availability word in the pool to timestamp queries
only, as occlusion queries don't use it, and pipeline statistic queries
likely won't either.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen
ce0c8cf941 radv: Add occlusion query shader.
Adds a shader for writing occlusion query results to a buffer, as the
CP packet isn't supported on SI or secondary buffers, and doesn't handle
the availability bit (or partial results) nor truncation to 32-bit.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-11 09:33:17 +02:00
Kenneth Graunke
50b987c0f0 i965: Fix wonky indentation left by brw_bo_alloc_tiled rename. 2017-04-10 23:25:13 -07:00
Ilia Mirkin
d9cc58d6ec nouveau: when mapping a persistent buffer, synchronize on former xfers
If the buffer is being used, we should wait for those uses to be
complete before returning the map.

Fixes: GL45-CTS.direct_state_access.buffers_functional
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-04-11 00:13:55 -04:00
Ilia Mirkin
8036809799 nvc0: increase texture buffer object alignment to 256 for pre-GM107
We currently don't pass the low byte of the address via the surface
info, so in order to work with images, these have to implicitly be
aligned to 256. The proprietary driver also doesn't go out of its way to
provide lower alignment.

Fixes GL45-CTS.texture_buffer.texture_buffer_texture_buffer_range

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-11 00:13:55 -04:00
Timothy Arceri
8ffd54fef8 mesa: fix typo and add assert() to _mesa_attach_renderbuffer_without_ref()
This function should only be used with a "freshly created" renderbuffer
so assert RefCount is 1.
2017-04-11 09:57:45 +10:00
Kenneth Graunke
bd84252be6 i965/drm: Add stall warnings when mapping or waiting on BOs.
This restores the performance warnings removed in:

    i965: Drop brw_bo_map[_gtt] wrappers which issue perf warnings.

but adds them for nearly all BO mapping, and also for wait_rendering.

Because we add this to the core bufmgr, we automatically get stall
warnings in all callers, unlike before where only a few callsites used
the wrappers that gave stall warnings.

We also do it a bit differently: we simply measure how long set_domain
takes (the part that stalls), and complain if it's more than 0.01 ms.
We don't bother calling brw_bo_busy(), and we don't measure the mmap
time (which doesn't stall).  This should be more accurate.
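
The measurement approach can be sketched like this. It is an illustration only: timed_wait and is_stall are hypothetical names, the callback stands in for the set_domain ioctl, and the warning goes to stderr rather than through the driver's perf-debug machinery.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime() */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

#define STALL_THRESHOLD_MS 0.01

/* Monotonic time in milliseconds. */
static double
now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1000000.0;
}

static bool
is_stall(double elapsed_ms)
{
    return elapsed_ms > STALL_THRESHOLD_MS;
}

/* Time only the potentially blocking operation and warn if it took
 * longer than the threshold. Returns the elapsed time in ms. */
static double
timed_wait(void (*op)(void *), void *data, const char *what)
{
    assert(op != NULL);

    double start = now_ms();
    op(data);
    double elapsed = now_ms() - start;

    if (is_stall(elapsed))
        fprintf(stderr, "%s stalled for %.3f ms\n", what, elapsed);
    return elapsed;
}
```

Timing just the blocking call, instead of asking beforehand whether the buffer is busy, means the warning reflects the stall that actually happened.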

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-04-10 14:33:18 -07:00
Kenneth Graunke
f053ee78ed i965/drm: Make a set_domain() helper function.
Less boilerplate.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-04-10 14:33:18 -07:00
Daniel Vetter
a99a4979fd i965/batch: Ensure we use a consistent offset in relocs
In theory gcc is free to re-load them, and if a concurrent
execbuf races and updates bo->offset64 then we have a problem:
execbuffer api requires that the ->presumed_offset and the one
we used for the reloc matches. It does not require that the value
is sensible, which means no locks needed, just a consistent load.

Ken said his next series will nuke this, so just hand-roll the
kernel's READ_ONCE idea inline.
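
For reference, the READ_ONCE idea amounts to a single volatile load whose result is then reused. The sketch below uses simplified, hypothetical structs and relies on __typeof__, a GCC/Clang extension; it is not the code from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Force the compiler to load x exactly once at this point. */
#define READ_ONCE(x) (*(volatile __typeof__(x) *)&(x))

struct bo {
    uint64_t offset64;   /* may be updated by a concurrent execbuf */
};

struct reloc {
    uint64_t presumed_offset;
    uint32_t delta;
};

/* Fill in a relocation and return the address to write into the batch.
 * Both values derive from the same single load of offset64, so they
 * always agree even if offset64 changes concurrently. */
static uint64_t
emit_reloc(struct reloc *r, struct bo *target, uint32_t delta)
{
    assert(r != NULL && target != NULL);

    uint64_t offset = READ_ONCE(target->offset64);

    r->presumed_offset = offset;
    r->delta = delta;
    return offset + delta;
}
```

The loaded value does not have to be current, only identical in both places, which is why no lock is taken.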

FIXME: Most callers of brw_emit_reloc recompute the relocation
themselves, which means this doesn't really fix the race. But the long
term plan is to move to per-context relocation handling, which will
fix this all properly. So leave this for now as just a reminder.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-10 14:33:18 -07:00
Daniel Vetter
7f3c85c21e i965/bufmgr: Garbage-collect vma cache/pruning
This was done because the kernel has 1 global address space, shared
with all render clients, for gtt mmap offsets, and that address space
was only 32bit on 32bit kernels.

This was fixed in

commit 440fd5283a87345cdd4237bdf45fb01130ea0056
Author: Thierry Reding <treding@nvidia.com>
Date:   Fri Jan 23 09:05:06 2015 +0100

    drm/mm: Support 4 GiB and larger ranges

which shipped in 4.0. Of course you still want to limit the bo cache
to a reasonable size on 32bit apps to avoid ENOMEM, but that's better
solved by tuning the cache a bit. On 64bit, this was never an issue.

On top, mesa never set this, so it's all dead code. Collect and trash it.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-10 14:33:18 -07:00
Daniel Vetter
1f965d3f7a i965/bufmgr: Remove some reuse functions
is_reusable was needed by uxa because it couldn't keep track of its
scanout buffers and used this as a proxy. Disabling reuse is a silly
idea, we set this once at start. Remove both.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-10 14:33:18 -07:00
Daniel Vetter
edd85c1f04 i965/bufmgr: remove start_gtt_access
Iirc this was used by uxa for persistent mmaps of the frontbuffer. For
mesa all the set_domain stuff needed before a synchronized mmap is handled
within the bufmgr, so no reason ever to call this.

Inline the implementation into its only internal user.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-10 14:33:17 -07:00
Daniel Vetter
439edaa4b5 i965/bufmgr: Delete set_tiling
Entirely unused, and really shouldn't be used. The alloc functions already
take care of this. And even in a future where we're not going to
h/v-align tiled buffers in the bufmgr, but only in isl, I think we
still want to adjust the tiling mode in the bufmgr, since that ties in
closely to mmaps and stuff like that.

get_tiling is still needed for the import paths (until we have modifiers
everywhere).

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-10 14:33:17 -07:00
Daniel Vetter
6308121475 i965/bufmgr: Delete alloc_for_render
Entirely unused, mesa instead used the BO_ALLOC_FOR_RENDER flag.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-10 14:33:14 -07:00
Kenneth Graunke
538fa87f40 i965/drm: Use list_for_each_entry_safe in a couple of cases.
Suggested by Chris Wilson.  A tiny bit simpler.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-04-10 14:33:12 -07:00
Kenneth Graunke
10929da5fb i965/drm: Rename intel_bufmgr_gem.c to brw_bufmgr.c.
Matches the class name and the header file name.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:32:32 -07:00
Kenneth Graunke
7aa66e64fe i965/drm: Reindent intel_bufmgr_gem.c and brw_bufmgr.h.
indent -i3 -nut -br -brs -npcs -ce --no-tabs -Tuint32_t -Tuint64_t
plus some manual fixes because those aren't quite the right settings.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:32:30 -07:00
Kenneth Graunke
d30a92738c i965/drm: Rename drm_bacon_bo to brw_bo.
The bacon is all gone.

This renames both the class and the related functions.  We're about to
run indent on the bufmgr code, so no need to worry about fixing bad
indentation.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:32:28 -07:00
Kenneth Graunke
e0d15e9769 i965: Drop brw_bo_map[_gtt] wrappers which issue perf warnings.
The stupid reason for eliminating these functions is that I'm about
to rename drm_bacon_bo_map() to brw_bo_map(), which makes the real
function have the short name, rather than the wrapper.

I'm also planning on reworking our mapping code soon, so we use WC
mappings and proper unsynchronized mappings on non-LLC platforms.
It will be easier to do that without thinking about the stall
warnings and wrappers.

My eventual hope is to put the performance warnings in the BO map
function itself, so all callers gain the warning.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:32:25 -07:00
Kenneth Graunke
dfd81373b6 i965/drm: Rename drm_bacon_reg_read() to brw_reg_read().
Less bacon.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:32:24 -07:00
Kenneth Graunke
662a733dbc i965/drm: Rename drm_bacon_bufmgr to struct brw_bufmgr.
Also stop using typedefs, per Mesa coding style.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:32:21 -07:00
Kenneth Graunke
f5216b25e0 i965: Just use a uint32_t context handle rather than a malloc'd wrapper.
drm_bacon_context is a malloc'd struct containing a uint32_t context ID
and a pointer back to the bufmgr.  The bufmgr pointer is pretty useless,
as everybody already has brw->bufmgr.  At that point...we may as well
just use the ctx_id handle directly.  A number of places already had to
call drm_bacon_gem_context_get_id() to extract the ID anyway.  Now they
just have it.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:32:20 -07:00
Kenneth Graunke
4cb3e4429d i965/drm: Fold drm_bacon_gem_reset_stats into the callers.
We're going to get rid of drm_bacon_context shortly, so we'd have to
change the interface slightly.  It's basically just an ioctl wrapper
that isn't terribly bufmgr-related, so we may as well just combine it
with the code in brw_reset.c that actually uses it.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:32:19 -07:00
Kenneth Graunke
414c9343a2 i965/drm: Rename drm_bacon_gem_bo_bucket to bo_cache_bucket.
No need for a prefix as this struct is local to the .c file.

Less bacon.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:32:17 -07:00
Kenneth Graunke
e46b74d1b5 i965/drm: Drop drm_bacon_* from static functions.
Mesa style is to not use lengthy prefixes for static functions.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:32:16 -07:00
Kenneth Graunke
13596ecb6b i965/drm: Drop drm_bacon_gem_bo_madvise_internal().
The only difference is that it takes an explicit bufmgr rather than
using bo->bufmgr, but there is only one bufmgr per screen so they
should be identical anyway.

Chris says this was added primarily to avoid bo/bo_gem casting,
which was inconvenient.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:32:15 -07:00
Kenneth Graunke
9ee252865e i965/drm: Merge drm_bacon_bo_gem into drm_bacon_bo.
The separate class gives us a bit of extra encapsulation, but I don't
know that it's really worth the boilerplate.  I think we can reasonably
expect the rest of the driver to be responsible.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:32:14 -07:00
Kenneth Graunke
59fdd94b85 i965/drm: Merge bo->handle and bo_gem->gem_handle.
These fields are the same value.  In the bad old days, bo->handle could
have been an identifier from the pre-GEM fake bufmgr, but that's long
gone.  Keep the "gem_handle" name for clarity.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:32:08 -07:00
Kenneth Graunke
eb41aa82c4 i965/drm: Rewrite relocation handling.
The execbuf2 kernel API requires us to construct two kinds of lists.
First is a "validation list" (struct drm_i915_gem_exec_object2[])
containing each BO referenced by the batch.  (The batch buffer itself
must be the last entry in this list.)  Each validation list entry
contains a pointer to the second kind of list: a relocation list.
The relocation list contains information about pointers to BOs that
the kernel may need to patch up if it relocates objects within the VMA.

This is a very general mechanism, allowing every BO to contain pointers
to other BOs.  libdrm_intel models this by giving each drm_intel_bo a
list of relocations to other BOs.  Together, these form "reloc trees".

Processing relocations involves a depth-first-search of the relocation
trees, starting from the batch buffer.  Care has to be taken not to
double-visit buffers.  Creating the validation list has to be deferred
until the last minute, after all relocations are emitted, so we have the
full tree present.  Calculating the amount of aperture space required to
pin those BOs also involves tree walking, which is expensive, so libdrm
has hacks to try and perform less expensive estimates.

For some reason, it also stored the validation list in the global
(per-screen) bufmgr structure, rather than as a local variable in the
execbuffer function, requiring locking for no good reason.

It also assumed that the batch would probably contain a relocation
every 2 DWords - which is absurdly high - and simply aborted if there
were more relocations than the max.  This meant the first relocation
from a BO would allocate 180kB of data structures!

This is way too complicated for our needs.  i965 only emits relocations
from the batchbuffer - all GPU commands and state such as SURFACE_STATE
live in the batch BO.  No other buffer uses relocations.  This means we
can have a single relocation list for the batchbuffer.  We can add a BO
to the validation list (set) the first time we emit a relocation to it.
We can easily keep a running tally of the aperture space required for
that list by adding the BO size when we add it to the validation list.

This patch overhauls the relocation system to do exactly that.  There
are many nice benefits:

- We have a flat relocation list instead of trees.
- We can produce the validation list up front.
- We can allocate smaller arrays and dynamically grow them.
- Aperture space checks are now (a + b <= c) instead of a tree walk.
- brw_batch_references() is a trivial validation list walk.
  It should be straightforward to make it O(1) in the future.
- We don't need to bloat each drm_bacon_bo with 32B of reloc data.
- We don't need to lock in execbuffer, as the data structures are
  context-local, and not per-screen.
- Significantly less code and a better match for what we're doing.
- The simpler system should make it easier to take advantage of
  I915_EXEC_NO_RELOC in a future patch.
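
The flat scheme listed above can be sketched as follows. This is a simplified model with hypothetical struct and function names, not the driver's implementation; it omits the kernel structures and the rule that the batch buffer must be the last validation list entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

struct bo { uint64_t size; uint64_t offset; };

struct reloc {
    uint32_t target_index;     /* index into the validation list */
    uint32_t batch_offset;     /* where in the batch the pointer lives */
    uint64_t presumed_offset;
};

struct batch {
    struct bo **exec_bos;  int exec_count, exec_cap;   /* validation list */
    struct reloc *relocs;  int reloc_count, reloc_cap; /* flat reloc list */
    uint64_t aperture;     /* running total of referenced BO sizes */
};

/* Add bo to the validation list on first use; return its index or -1. */
static int
add_exec_bo(struct batch *b, struct bo *bo)
{
    for (int i = 0; i < b->exec_count; i++)
        if (b->exec_bos[i] == bo)
            return i;

    if (b->exec_count == b->exec_cap) {
        int cap = b->exec_cap ? b->exec_cap * 2 : 4;
        struct bo **p = realloc(b->exec_bos, cap * sizeof(*p));
        if (!p)
            return -1;
        b->exec_bos = p;
        b->exec_cap = cap;
    }
    b->exec_bos[b->exec_count] = bo;
    b->aperture += bo->size;
    return b->exec_count++;
}

/* Record a relocation from the batch to target. */
static bool
emit_reloc(struct batch *b, uint32_t batch_offset, struct bo *target)
{
    assert(b != NULL && target != NULL);

    int index = add_exec_bo(b, target);
    if (index < 0)
        return false;

    if (b->reloc_count == b->reloc_cap) {
        int cap = b->reloc_cap ? b->reloc_cap * 2 : 4;
        struct reloc *p = realloc(b->relocs, cap * sizeof(*p));
        if (!p)
            return false;
        b->relocs = p;
        b->reloc_cap = cap;
    }
    b->relocs[b->reloc_count++] = (struct reloc) {
        .target_index = (uint32_t)index,
        .batch_offset = batch_offset,
        .presumed_offset = target->offset,
    };
    return true;
}

/* The aperture check is a single comparison: a + b <= c. */
static bool
batch_fits(const struct batch *b, uint64_t extra, uint64_t threshold)
{
    return b->aperture + extra <= threshold;
}

/* Does the batch reference bo? A linear walk of the validation list. */
static bool
batch_references(const struct batch *b, const struct bo *bo)
{
    for (int i = 0; i < b->exec_count; i++)
        if (b->exec_bos[i] == bo)
            return true;
    return false;
}

static void
batch_free(struct batch *b)
{
    free(b->exec_bos);
    free(b->relocs);
    *b = (struct batch) {0};
}
```

Emitting three relocations to two distinct BOs leaves two validation entries and three relocs, and the aperture tally is simply the sum of the two BO sizes.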

Improves performance in Synmark 7.0's OglBatch7:

    - Skylake GT4e: 12.1499% +/- 2.29531%  (n=130)
    - Apollolake:   3.89245% +/- 0.598945% (n=35)

Improves performance in GFXBench4's gl_driver2 test:

    - Skylake GT4e: 3.18616% +/- 0.867791% (n=229)
    - Apollolake:   4.1776%  +/- 0.240847% (n=120)

v2: Feedback from Chris Wilson:
    - Omit explicit zero initializers for garbage execbuf fields.
    - Use .rsvd1 = ctx_id rather than i915_execbuffer2_set_context_id
    - Drop unnecessary fencing assertions.
    - Only use _WR variant of execbuf ioctl when necessary.
    - Shrink the arrays to be smaller by default.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:32:00 -07:00
Kenneth Graunke
e7ab0ea5e7 i965/drm: Make register write check handle execbuffer directly.
I'm about to rewrite how relocation handling works, at which point
drm_bacon_bo_emit_reloc() and drm_bacon_bo_mrb_exec() won't exist
anymore.  This code is already largely not using the batchbuffer
infrastructure, so just go all the way and handle relocations, the
validation list, and execbuffer ourselves.  That way, we don't have
to think about the weird case where we only have a screen, and no context,
when redesigning the relocation handling.

v2: Write reloc.presumed_offset + reloc.delta into the batch, rather
    than duplicating the comment, so it's obvious that they match
    (suggested by Chris).  Also add a comment about why we don't do
    any error checking.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:56 -07:00
Kenneth Graunke
6368284a34 i965: Make a screen::aperture_threshold field.
This is the threshold after which drm_intel_bufmgr_check_aperture_space
returns -ENOSPC, signalling that it thinks an execbuf is likely to fail
and we need to roll back and flush the batch.

We'll need this when we rewrite aperture space checking, shortly.
In the meantime, we can also use it in GLX_MESA_query_renderer.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:55 -07:00
Kenneth Graunke
6079f4f16e i965: Make/use a brw_batch_references() wrapper.
We'll want to change the implementation of this shortly.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:54 -07:00
Kenneth Graunke
6537a3ca11 i965: Use brw_emit_reloc() instead of drm_bacon_bo_emit_reloc().
I'm about to make brw_emit_reloc do actual work, so everybody needs
to start using it and not the raw drm_bacon function.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:52 -07:00
Kenneth Graunke
eadd5d1b51 i965: Change intel_batchbuffer_reloc() into brw_emit_reloc().
This renames intel_batchbuffer_reloc to brw_emit_reloc and changes the
parameter naming and ordering to match drm_intel_bo_emit_reloc().

For now, it's a trivial wrapper that accesses batch->bo.  When we
rework relocations, it will start doing actual work.

target_offset should be expanded to a uint64_t to match the kernel,
but for now we leave it as its original 32-bit type.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:51 -07:00
Kenneth Graunke
fbb3297165 i965/drm: Drop GEM_SW_FINISH stuff.
This is only useful when doing an incoherent CPU mapping of the current
scanout buffer.  That's a terrible plan, so we never do it.  We always
use an uncached GTT map.

So, this is useless.  Drop the code.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:49 -07:00
Kenneth Graunke
80761a42e0 i965/drm: Drop code to search for an existing bufmgr.
This functionality was added by libdrm commit
743af59669386cb6e063fa4bd85f0a0b2da86295 (intel: make bufmgr_gem
shareable from different API) in an attempt to solve libva/mesa buffer
sharing problems.  Specifically, this was working around an issue hit
by Chromium, which used the same drm_fd for multiple APIs, and shared
buffers between them.

This code attempted to work around that issue by using the same bufmgr
for both libva and Mesa.  It worked because libdrm_intel was loaded by
both libraries.  However, now that Mesa has forked, we don't have a
common library, and this code cannot work.

The correct solution is to have each API open its own file descriptor
(and get a corresponding buffer manager), and then use PRIME export
and import to share BOs across those APIs.  Then the kernel can manage
those shared resources.  According to Chris, the kernel will pass back
the same handle for a prime FD if the lookup is from the same device FD.

We believe Chromium has since moved to this model.

In Mesa, there is already only one screen per FD, and so there will
only be one bufmgr per FD.  We don't need any of this code.

v2: Add a big warning comment written by Chris Wilson.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:48 -07:00
Kenneth Graunke
b666654201 i965/drm: Unwrap the unnecessary drm_bacon_reloc_target_info struct.
This used to have another field, but now it's just a BO pointer.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:46 -07:00
Kenneth Graunke
2662894baa i965/drm: Switch from uthash to Mesa's hash table.
No performance data has been gathered about this choice.  I just don't
want that many hash tables.  Chris points out that this is not
performance critical - we should not be recreating that many handles
from scratch.  In the past we used a linear list, which became
unreasonable in stress tests that used hundreds of thousands of BOs.
In real usage, it shouldn't matter that much.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:45 -07:00
Kenneth Graunke
ad1b1cce44 i965/drm: Drop bo_gem::kflags.
It's always zero now.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:43 -07:00
Kenneth Graunke
a972c903cb i965/drm: Drop has_exec_async related API.
Mesa doesn't use this yet.  We'll almost certainly want to, but we can
add the functionality back after we clean up the messy drm code.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:42 -07:00
Kenneth Graunke
d606f64e2d i965/drm: Drop softpin support for now.
We may want this eventually, but simplify for now.  We can add it back
later when we actually intend to use it.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:41 -07:00
Kenneth Graunke
0314eed3b1 i965/drm: Drop userptr support for now.
We'll want userptr support for GL_AMD_pinned_memory support someday,
and possibly some other upload optimizations.  Chris says "not in this
form" though.  Drop it and simplify for now - we can add it back later
when we're ready to hook it up fully.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:39 -07:00
Kenneth Graunke
a460e1eb51 i965/drm: Delete engine checks.
This is basically handholding to prevent a bogus caller from trying to
execbuffer on a bogus engine.  i965 already does this correctly.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:37 -07:00
Kenneth Graunke
1dc02da6d7 i965/drm: Drop intel_chipset.h in favor of using gen_device_info.
This moves the PCI ID detection to intel_screen.c and makes
drm_bacon_bufmgr_gem_init() take a devinfo pointer.

We also drop the HAS_LLC query stuff - devinfo has that info already,
without kernel queries, and it makes no sense to have two has_llc flags
set by different mechanisms.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:36 -07:00
Kenneth Graunke
55ee8f36a8 i965/drm: Drop deprecated drm_bacon_bo::offset.
This field was the wrong size, so we replaced it with offset64.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:35 -07:00
Kenneth Graunke
a29fb9b2ee i965/drm: Drop has_wait_timeout.
The wait-ioctl was introduced in kernel v3.6 (20120930) and that is our
current minimum requirement for screen creation.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:33 -07:00
Kenneth Graunke
b97bcf3b6b i965/drm: Assume aperture size query will work.
This query has been available since 2.6.28.  We require 3.6.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:32 -07:00
Kenneth Graunke
c28691ab77 i965/drm: Combine drm_bacon_bufmgr_gem and drm_bacon_bufmgr classes.
The distinction was required when the bufmgr was virtualised.  Now that
there is only one class, we no longer need the distraction of pretending
it is a subclass.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:31 -07:00
Kenneth Graunke
3673b89bf3 i965/drm: Move _drm_bacon_context to intel_bufmgr_gem.c.
This moves us one step closer to killing off intel_bufmgr_priv.h.

We might want to nuke it altogether, since it's basically just a
uint32_t handle, but for now, let's focus on removing files.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:29 -07:00
Kenneth Graunke
b0d1c5983b i965/drm: Drop cliprects and dr4 from execbuf variants.
Legacy DRI1 leftovers.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:28 -07:00
Kenneth Graunke
2c257ff226 i965/drm: Devirtualize the bufmgr.
libdrm_bacon used to have a GEM-based bufmgr and a legacy fake bufmgr,
but that's long since dead (and we never imported it to i965).  So,
drop the extra layer of function pointers.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:27 -07:00
Kenneth Graunke
dca224a9ef i965/drm: Check INTEL_DEBUG & DEBUG_BUFMGR directly.
Eliminates some API around this, and more importantly, the last
field in one bufmgr class.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:25 -07:00
Kenneth Graunke
68cb0c6d92 i965/drm: Use Mesa's macros.h instead of duplicating them.
Replace the duplicated macros imported from libdrm:

   ARRAY_SIZE, MAX2, ALIGN, STATIC_ASSERT

and remove unused ROUND_UP_TO and ROUND_UP_TO_MB.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:24 -07:00
Kenneth Graunke
c5cdb0f405 i965/drm: Use ALIGN, not ROUND_UP_TO.
ROUND_UP_TO handles a NPOT alignment, but all the alignments we use
are power of two anyway, so there's no need.
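
To illustrate the difference, here is a small sketch of the two rounding
strategies.  The function names are hypothetical and these are not the
actual macro definitions from Mesa or libdrm; they only show why a
power-of-two alignment can use a mask while an arbitrary alignment needs
a division.

```c
#include <assert.h>
#include <stdint.h>

/* Power-of-two alignments only: add (alignment - 1), then mask off the
 * low bits.
 */
static uint64_t
sketch_align_pot(uint64_t value, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Any non-zero alignment, including non-power-of-two ones: this needs a
 * division and a multiplication.
 */
static uint64_t
sketch_round_up_to(uint64_t value, uint64_t alignment)
{
   assert(alignment != 0);
   return ((value + alignment - 1) / alignment) * alignment;
}
```

For power-of-two alignments both functions return the same result, so
the general version buys nothing in that case.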

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:23 -07:00
Kenneth Graunke
1d476e64e5 i965/drm: Delete execbuf1 support.
execbuf2 has been around since v2.6.33.  We require v3.6.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:21 -07:00
Kenneth Graunke
ddf01d3f41 i965/drm: Remove Gen2-3 fence accounting.
Since gen4, we do not use fence registers for any GPU access and so
never have to account for the fence during batch construction. All the
related fence functions are unused.

Based on Kristian Høgsberg's patch; commit message by Chris Wilson.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:20 -07:00
Kenneth Graunke
4f698b0049 i965/drm: Remove some unused functions and macros.
Mesa doesn't use these functions or macros, so we can delete them,
and save work refactoring and cleaning them up.  We'll delete a lot
more later, too.

Based on a patch by Kristian Høgsberg.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:18 -07:00
Kenneth Graunke
09b2f6124a i965/drm: Switch to util/list.h instead of libdrm_lists.h.
Both are kernel style lists, so this is trivial.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:16 -07:00
Kenneth Graunke
7c64096b2d i965/drm: Port to Mesa's atomic header.
Drop xf86atomic.h in favor of Mesa's util/u_atomic.h.  We replace the
atomic_t wrapper struct with a bare integer, switch to the 'p_atomic'
naming conventions, and move over the one extra helper.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:13 -07:00
Kenneth Graunke
eed86b975e i965/drm: Use our internal libdrm (drm_bacon) rather than the real one.
Now we can actually test our changes.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:11 -07:00
Kenneth Graunke
91b973e3a3 i965/drm: s/drm_intel/drm_bacon/g
Using drm_intel_* as a prefix is hazardous - we don't want to conflict
with the actual libdrm_intel symbols.  In particular, I think we could
get into trouble during the final megadrivers linking.

So, rename everything to a different yet arbitrary prefix.  bacon and
intel are the same number of characters, so we don't have to reindent
the world.  It's also an homage to Ian's "Bacon Trail" platform.

I was going to use "drm_relic" to poke fun at libdrm being ancient,
and so we could explain the name with a "historical reasons" pun,
but it sounds too much like ralloc.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:09 -07:00
Kenneth Graunke
4ad0758f51 i965/drm: Drop libpciaccess dependencies.
i965 doesn't use drm_intel_get_aperture_sizes(), so we can delete
support for it.  This avoids a build dependency on libpciaccess.

Chris also notes:

"There's a really old bug that hopefully has been closed already
 (although as far as I can tell, it has never been fixed) about
 how using libpciaccess from libdrm_intel breaks the world (since
 libpciaccess uses a singleton that is torn down at the first request
 rather than upon the last user)."

This bug should go away in two commits when we switch over to our
internal copy of libdrm_intel.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84325
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:05 -07:00
Kenneth Graunke
d614135e95 i965/drm: Make libdrm_lists.h compile by defining typeof.
typeof doesn't seem to exist, so this won't compile (but we don't yet
try).  Define it to __typeof__.  This code is going to die soon anyway.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:03 -07:00
Kenneth Graunke
b97c7ef4c8 i965/drm: remove legacy defines, aub functions, and decoder prototypes
We never imported any of this code, so drop the prototypes, unused
enums, and defines.

Based on patches by Emil Velikov.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:31:00 -07:00
Kenneth Graunke
514db96c11 i965: Import libdrm_intel.
This imports commit 19c4cfc54918d361f2535aec16650e9f0be667cd of
libdrm/intel/*.[ch], minus a few files that we're never going to use
(and would immediately delete), plus a few necessary dependencies.

We rename intel_bufmgr.h to brw_bufmgr.h to avoid #include conflicts.
We also fix UTF-8 symbol problems in intel_bufmgr_gem.c comments
because vim keeps trying to fix that every time I edit the file,
and we may as well fix it right away.

Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:30:53 -07:00
Kenneth Graunke
915820cc59 i965: Make sure we don't use CPU maps for the scanout buffer.
Using an incoherent CPU map on the active scanout buffer is really
sketchy - we may need extra flushing via GEM_SW_FINISH, or using
drmModeDirtyFB() and kernel commit a6a7cc4b7db6d (4.10+).

Chris suggests "never ever do that", which seems like a wise plan!

intel_miptree_map_raw() uses CPU maps on linear buffers.

Having a linear scanout buffer should be really rare, and mapping the
front buffer should be similarly rare.  Together, it should basically
never happen.  But, in case it does somehow...make sure that mapping
the scanout buffer always goes through an uncached GTT map.

v2: Add a giant comment written by Chris Wilson.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-10 14:30:49 -07:00
Kenneth Graunke
eb28ce2b0b i965: Stop calling drm_intel_bufmgr_gem_enable_fenced_relocs().
This does nothing on Gen4+, which is the only hardware we support.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:30:44 -07:00
Kenneth Graunke
034b220dc4 i965: Fix GLX_MESA_query_renderer video memory on 32-bit.
On modern systems with 4GB apertures, the size in bytes is 4294967296,
or (1ull << 32).  The kernel gives us the aperture size as a __u64,
which works out great.

Unfortunately, libdrm "helpfully" returns the data as a size_t, which
on 32-bit systems means it truncates the aperture size to 0 bytes.
We've happily reported this value as 0 MB of video memory via
GLX_MESA_query_renderer since it was originally exposed.

This patch bypasses libdrm and calls the ioctl ourselves so we can
use a proper uint64_t, avoiding the 32-bit integer overflow.  We now
report a proper video memory size on 32-bit systems.

Chris points out that the aperture size (CPU mappable size limit)
isn't really the right thing to be checking.  But libdrm_intel uses
it to fail execbuffer, so it is an actual limit for now.  Once that's
fixed we can probably move to something else.  In the meantime, fix
the obvious typecasting bug.
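
The truncation can be reproduced in isolation.  The sketch below models
a 32-bit size_t with uint32_t so that it behaves the same on any host;
the function names are hypothetical and this is not the libdrm code
path, only the arithmetic (it assumes a C11 compiler for static_assert).

```c
#include <assert.h>
#include <stdint.h>

/* The kernel reports a 64-bit size; a 32-bit size_t cannot hold it. */
static_assert(sizeof(uint64_t) > sizeof(uint32_t),
              "the sketch models a narrowing conversion");

/* Models returning the aperture size through a 32-bit size_t: only the
 * low 32 bits survive the conversion.
 */
static uint32_t
sketch_aperture_as_size32(uint64_t aperture_bytes)
{
   return (uint32_t) aperture_bytes;
}

/* Convert a byte count to megabytes, as a renderer query might report. */
static uint64_t
sketch_bytes_to_mb(uint64_t bytes)
{
   return bytes / (1024 * 1024);
}
```

With a 4GB aperture (1ull << 32 bytes), the narrowed value is 0 and so
is the reported size in MB, while the 64-bit value gives 4096 MB.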

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-10 14:30:40 -07:00
Samuel Pitoiset
5bcfe90501 gallium/radeon: add HUD queries for GPU temperature and clocks
Previously, only the Radeon kernel driver exposed the GPU temperature
and the shader/memory clocks.  This implements the same functionality
for the AMDGPU kernel driver.

These queries will return 0 if the DRM version is less than 3.10.
I don't explicitly check the version here because the query
codepath is already a bit messy.

v2: - rebase on top of master

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 23:06:19 +02:00
Samuel Pitoiset
0f39fb8500 configure.ac: require libdrm_amdgpu 2.4.79
The sensor info requires amdgpu_query_sensor_info().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 23:06:17 +02:00
Samuel Pitoiset
def02007cd radeonsi: add new si_check_render_feedback_texture() helper
For bindless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 23:05:41 +02:00
Samuel Pitoiset
fbcc8664fd radeonsi: add new si_decompress_color_texture() helper
For bindless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 23:05:38 +02:00
Samuel Pitoiset
6646212de0 radeonsi: add new depth_needs_decompression() helper
v2: - rename to depth_needs_decompression() instead

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 23:05:32 +02:00
Samuel Pitoiset
9cc91ba6d5 radeonsi: add a 'break' in si_check_render_feedback_*()
No need to check all color buffers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 23:05:29 +02:00
Samuel Pitoiset
51d6641700 radeonsi: re-use 'desc' in si_set_shader_image()
No need to compute the offset in the descriptor twice.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 23:05:27 +02:00
Samuel Pitoiset
a1c37ff9e4 ac: add unreachable() in ac_build_image_opcode()
To silence the following compiler warning:

common/ac_llvm_build.c: In function ‘ac_build_image_opcode’:
common/ac_llvm_build.c:1080:3: warning: ‘name’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   snprintf(intr_name, sizeof(intr_name), "%s%s%s%s.v4f32.%s.v8i32",
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    name,
    ~~~~~
    a->compare ? ".c" : "",
    ~~~~~~~~~~~~~~~~~~~~~~~
    a->bias ? ".b" :
    ~~~~~~~~~~~~~~~~
    a->lod ? ".l" :
    ~~~~~~~~~~~~~~~
    a->deriv ? ".d" :
    ~~~~~~~~~~~~~~~~~
    a->level_zero ? ".lz" : "",
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~
    a->offset ? ".o" : "",
    ~~~~~~~~~~~~~~~~~~~~~~
    type);
    ~~~~~

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 23:02:12 +02:00
Constantine Kharlamov
61e47d92c5 r600g: get rid of dummy pixel shader
The idea is taken from radeonsi. The code was mostly already checking for a
null pixel shader, so few checks had to be added.

Interestingly, according to testing with GTAⅣ, although binding of a null
shader happens a lot at the start (and then just stops), draw_vbo() never
actually sees a null ps.

v2: added a check I missed because of a macro using a prefix to choose
a shader.

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 22:45:22 +02:00
Constantine Kharlamov
544b40089b r600g: add draw_vbo check for a NULL pixel shader
Taken from radeonsi, required to remove dummy pixel shader in the next patch

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 22:45:22 +02:00
Constantine Kharlamov
22de96680c r600g: skip repeating vs, gs, and tes shader binds
The idea is taken from radeonsi. The code lacks some checks for null vs,
and I'm unsure about some changes against that, so I left it in place.

Some statistics for GTAⅣ:
Average tessellation shader bind skips per frame: ≈350
Average geometry shader bind skips per frame: ≈260
Skipped vertex shader binds occur too rarely to register in the per-frame
counter at all, so I'm just going to say: it happens.

v2: I had accidentally removed an empty line; don't do this.
v3: restore the check for null tes and gs, since I haven't yet figured out
a way to move the stride assignment to r600_update_derived_state() (as it
is in radeonsi).

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 22:45:22 +02:00
Bartosz Tomczyk
a4019a81ab mesa: use single memcpy when strides match in glReadPixels, texstore code
v2: fix indentation

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-10 14:42:17 -06:00
Jason Ekstrand
da2ac19511 intel/blorp: Use ISL for emitting depth/stencil/hiz
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-04-10 07:57:21 -07:00
Jason Ekstrand
d3785dcb2f intel/blorp: Emit 3DSTATE_STENCIL_BUFFER before HIER_DEPTH
We're about to replace blorp's emit code with ISL and it emits them in
the other order.  This makes diffing the aubs easier.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-04-10 07:57:21 -07:00
Jason Ekstrand
f93dc5beee anv: Use ISL for emitting depth/stencil/hiz
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-04-10 07:57:21 -07:00
Jason Ekstrand
bf95f7c209 intel/isl: Add support for emitting depth/stencil/hiz
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-04-10 07:57:21 -07:00
Thomas Hindoe Paaboel Andersen
957ccbe04a amd/addrlib: use correct variable name in header
Since the inclusion in 7f160efcde
the header used x_biased, while the implementation used y_biased.
This changes the header to match the implementation since the
uses of the function seem to expect y_biased.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-10 12:44:59 +10:00
Timothy Arceri
d0791ac2ed mesa/st: take ownership rather than adding reference for new renderbuffers
This avoids locking in the reference calls and fixes a leak after the
RefCount initialisation was changed from 0 to 1.

Fixes: 32141e53d1 (mesa: tidy up renderbuffer RefCount initialisation)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Bartosz Tomczyk <bartosz.tomczyk86@gmail.com>
2017-04-10 10:55:34 +10:00
Timothy Arceri
d9fe82fe41 x11: take ownership rather than adding reference for new renderbuffers
This avoids locking in the reference calls and fixes a leak after the
RefCount initialisation was changed from 0 to 1.

Fixes: 32141e53d1 (mesa: tidy up renderbuffer RefCount initialisation)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-10 10:55:34 +10:00
Timothy Arceri
a85b4e5719 osmesa: tidy up renderbuffer refCount initialisation
32141e53d1 changed _mesa_init_renderbuffer() to set it to 1 for
us.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-10 10:55:34 +10:00
Timothy Arceri
e6d6266e6f swrast: take ownership rather than adding reference for new renderbuffers
This avoids locking in the reference calls and fixes a leak after the
RefCount initialisation was changed from 0 to 1.

Fixes: 32141e53d1 (mesa: tidy up renderbuffer RefCount initialisation)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-10 10:55:34 +10:00
Timothy Arceri
6c02387b2c radeon: take ownership rather than adding reference for new renderbuffers
This avoids locking in the reference calls and fixes a leak after the
RefCount initialisation was changed from 0 to 1.

Fixes: 32141e53d1 (mesa: tidy up renderbuffer RefCount initialisation)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-10 10:55:34 +10:00
Timothy Arceri
1b85009ec1 nouveau: take ownership rather than adding reference for new renderbuffers
This avoids locking in the reference calls and fixes a leak after the
RefCount initialisation was changed from 0 to 1.

Fixes: 32141e53d1 (mesa: tidy up renderbuffer RefCount initialisation)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-10 10:55:34 +10:00
Timothy Arceri
3387f66cab i965: take ownership rather than adding reference for new renderbuffers
This avoids locking in the reference calls and fixes a leak after the
RefCount initialisation was changed from 0 to 1.

Fixes: 32141e53d1 (mesa: tidy up renderbuffer RefCount initialisation)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-10 10:55:34 +10:00
Timothy Arceri
c355675440 i915: take ownership rather than adding reference for new renderbuffers
This avoids locking in the reference calls and fixes a leak after the
RefCount initialisation was changed from 0 to 1.

Fixes: 32141e53d1 (mesa: tidy up renderbuffer RefCount initialisation)

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-10 10:55:34 +10:00
Timothy Arceri
074a485d35 mesa: create _mesa_attach_renderbuffer_without_ref() helper
This will be used to take ownership of freshly created renderbuffers,
avoiding the need to call the reference function which requires
locking.

V2: dereference any existing fb attachments and actually attach the
    new rb.

v3: split out validation and attachment type/complete setting into
    a shared static function.
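
A rough sketch of the ownership idea, under the assumption that a new
renderbuffer starts with a reference count of 1 that belongs to its
creator.  All names here are hypothetical stand-ins rather than the
Mesa functions, and locking and freeing are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a renderbuffer and a framebuffer slot. */
struct sketch_rb {
   int ref_count;
};

struct sketch_fb {
   struct sketch_rb *attachment;
};

/* New renderbuffers start with one reference, held by the creator. */
static void
sketch_init_renderbuffer(struct sketch_rb *rb)
{
   rb->ref_count = 1;
}

/* Drop the reference held through *ptr and clear the pointer. */
static void
sketch_unreference(struct sketch_rb **ptr)
{
   if (*ptr) {
      (*ptr)->ref_count--;
      *ptr = NULL;
   }
}

/* Reference-adding attach: the caller keeps its own reference and must
 * drop it later, otherwise the count never reaches zero.
 */
static void
sketch_attach_with_ref(struct sketch_fb *fb, struct sketch_rb *rb)
{
   sketch_unreference(&fb->attachment);
   rb->ref_count++;
   fb->attachment = rb;
}

/* Ownership-transferring attach: the creator's single reference simply
 * becomes the framebuffer's reference, so the count is not touched.
 */
static void
sketch_attach_without_ref(struct sketch_fb *fb, struct sketch_rb *rb)
{
   assert(rb->ref_count == 1);
   sketch_unreference(&fb->attachment);
   fb->attachment = rb;
}
```

In this model, attaching a fresh renderbuffer with the reference-adding
variant leaves the count at 2 unless the creator also drops its own
reference, which is one plausible way for a leak to appear once the
initial count is 1 instead of 0.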

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Bartosz Tomczyk <bartosz.tomczyk86@gmail.com>
2017-04-10 10:55:34 +10:00
Ilia Mirkin
89253d5c67 nv50/ir: remove unused swizzle field in ValueRef
The nv50 ir is scalar. Perhaps this was from some early attempts to
integrate the simd aspects of nv30. However at this point it's entirely
unused.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-09 14:59:42 -04:00
Boyan Ding
b1b189a0ab nouveau: enable ARB_shader_clock on nv50 and nvc0
v2: Also enable support on nv50

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-09 13:03:13 -04:00
Boyan Ding
6c3dd8f0ed nv50/ir: Handle TGSI_OPCODE_CLOCK
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
[imirkin: make zero mov non-fixed]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-09 13:03:13 -04:00
Boyan Ding
e2e2c69927 gm107/ir: Emit SV_CLOCK system value
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-09 13:03:13 -04:00
Ben Widawsky
6e907812f8 gbm: Assert modifiers and count are copacetic
The API/entry point in mesa already checks the correct behavior,
however, it's possible to be handled by another implementation and those
implementations should not be able to abuse a weird combination of count
and pointer.

This fixes CID 1403193

Cc: Mark Janes <mark.a.janes@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-09 09:29:57 -07:00
Gustaw Smolarczyk
a2eae66b8b st/mesa: Use compressed fog mode for atifs.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk
8a4b93b1d9 mesa/main/ff_frag: Use compressed TexEnv Combine state.
Along the way, add missing GL_ONE source support and drop non-existing
GL_ZERO and GL_ONE operand support.

Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk
f7c9bf0c6b mesa/main/ff_frag: Use compressed fog mode.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk
837ad2dc38 mesa/main: Maintain compressed TexEnv Combine state.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk
6fa34de830 mesa/main: Maintain compressed fog mode.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk
c9b2938aec mesa/main/ff_frag: Don't retrieve format if not necessary.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk
885012aab2 mesa/main/ff_frag: Use gl_texture_object::TargetIndex.
Instead of computing it once again using _mesa_tex_target_to_index.

Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk
a86891a9a9 mesa/main/ff_frag: Store nr_enabled_units only once.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk
0e89ab0d6e mesa/main/ff_frag: Simplify get_fp_input_mask.
Change it into filter_fp_input_mask transform function that instead of
returning a mask, transforms input.

Also, simplify the case of vertex program handling by assuming that
fp_inputs is always a combination of VARYING_BIT_COL* and VARYING_BIT_TEX*.

Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk
f5e685da06 mesa/main/ff_frag: Don't bother with VARYING_BIT_FOGC.
It's not used.

Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk
03b9b3c471 mesa/main/ff_frag: Remove unused struct.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk
ceb5ba9d1d mesa/main/ff_frag: Reduce the size of nr_enabled_units.
Since it holds values from 0 to 8, 4 bits will suffice.
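
The bit count can be checked with a short sketch.  The struct and names
below are hypothetical, not the actual state key from ff_frag, and the
static_assert assumes a C11 compiler.

```c
#include <assert.h>

#define SKETCH_MAX_COORD_UNITS 8

/* Hypothetical key with a 4-bit counter field (range 0..15). */
struct sketch_state_key {
   unsigned nr_enabled_units:4;
};

/* 4 bits can represent 0..15, which covers 0..8 inclusive. */
static_assert((1u << 4) - 1 >= SKETCH_MAX_COORD_UNITS,
              "4 bits must be able to hold the unit count");

/* Number of bits needed to store any value in 0..max_value. */
static unsigned
sketch_bits_needed(unsigned max_value)
{
   unsigned bits = 0;
   while (max_value) {
      bits++;
      max_value >>= 1;
   }
   return bits;
}
```

Three bits would only cover 0 to 7, so the value 8 is what pushes the
field to four bits.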

Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk
439eca951f mesa/main/ff_frag: Remove enabled_units.
Its only usage is easily replaced by nr_enabled_units. As for cache key
part, unit[i].enabled should be enough.

Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-08 20:29:58 +02:00
Gustaw Smolarczyk
3cc91537fa mesa/main/ff_frag: Use correct constant.
Since fixed-function shaders are restricted to MAX_TEXTURE_COORD_UNITS
texture units, use this constant instead of MAX_TEXTURE_UNITS. This
reduces the array size from 32 to 8.

Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-08 20:29:57 +02:00
Jason Ekstrand
098ca9949d intel/isl: Use genx_bits.h instead of a hand-rolled table
This gets rid of one piece of ugliness in the way ISL handles emitting
surface states.  I've never liked that hand-rolled table but it
was the best we had at the time.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-04-07 22:34:04 -07:00
Jason Ekstrand
b85d75b3e8 intel/genxml/bits: Emit per-container _length helpers
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-04-07 22:34:04 -07:00
Jason Ekstrand
f97e251ab2 intel/genxml/bits: Emit per-field _start helpers
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-04-07 22:34:04 -07:00
Jason Ekstrand
430e697868 intel/genxml/bits: Pull the function emit code into a helper block
The helper block is extremely general.  It takes a string property name
and an object that supports three methods: has_prop, iter_prop, and
get_prop.  This way we can easily generalize it to emit more different
types of getter functions.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-04-07 22:34:04 -07:00
Jason Ekstrand
2d52e65d03 intel/genxml/bits: Refactor to add a container class
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-04-07 22:34:04 -07:00
Ilia Mirkin
57a744025a nvc0/ir: fix overwriting of offset register with interpolateAtOffset
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-04-07 23:31:01 -04:00
Jason Ekstrand
bc68aa42bd anv: Use subpass dependencies for flushes
Instead of figuring it all out ourselves, just use the information given
to us by the client.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-04-07 19:24:14 -07:00
Jason Ekstrand
e5bbf8be36 anv/pass: Record required pipe flushes
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-04-07 19:24:14 -07:00
Jason Ekstrand
0039d0cf27 anv/pass: Use anv_multialloc for allocating the anv_pass
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-04-07 19:24:14 -07:00
Jason Ekstrand
415633a722 anv/descriptor_set: Use anv_multialloc for descriptor set layouts
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-04-07 19:24:14 -07:00
Jason Ekstrand
e5c29b8c27 anv: Add a helper for doing mass allocations
We tend to try to reduce the number of allocation calls the Vulkan
driver uses by doing a single allocation whenever possible for a data
structure.  While this has certain downsides (usually code complexity),
it does mean error handling and cleanup is much easier.  This commit
adds a nice little helper struct for getting rid of some of that
complexity.
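
For illustration only: a minimal sketch of the single-allocation idea,
assuming hypothetical names (multialloc_sketch_*). It is not the helper
this commit adds, and it assumes every requested alignment is a power of
two no larger than what malloc() guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_MAX_PARTS 8

/* Hypothetical helper, not the anv_multialloc API. */
struct multialloc_sketch {
   size_t size;                       /* running total, including padding */
   unsigned count;                    /* number of parts registered */
   void **ptrs[SKETCH_MAX_PARTS];     /* where to store each sub-pointer */
   size_t offsets[SKETCH_MAX_PARTS];  /* offset of each part in the block */
};

/* Reserve `size` bytes at `align` and remember which pointer should
 * receive the address once the block exists. */
static void
multialloc_sketch_add(struct multialloc_sketch *ma, void **ptr,
                      size_t size, size_t align)
{
   assert(ma->count < SKETCH_MAX_PARTS);
   assert(align != 0 && (align & (align - 1)) == 0);
   assert(align <= _Alignof(max_align_t));

   size_t offset = (ma->size + align - 1) & ~(align - 1);
   ma->ptrs[ma->count] = ptr;
   ma->offsets[ma->count] = offset;
   ma->count++;
   ma->size = offset + size;
}

/* One malloc for everything. Returns the block to free(), or NULL, in
 * which case no sub-pointer is written and there is nothing to unwind. */
static void *
multialloc_sketch_alloc(struct multialloc_sketch *ma)
{
   void *base = malloc(ma->size ? ma->size : 1);
   if (base == NULL)
      return NULL;
   for (unsigned i = 0; i < ma->count; i++)
      *ma->ptrs[i] = (char *)base + ma->offsets[i];
   return base;
}
```

With this shape, cleanup is a single free() of the returned block, which
is the error-handling benefit described above.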

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-04-07 19:24:14 -07:00
Jason Ekstrand
82695c32b6 anv: Add helpers for converting access flags to pipe bits
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-04-07 19:24:14 -07:00
Timothy Arceri
9d69416a7e mesa: simplify and optimise vertex bindings tracking
We only need to update it if something changes. Also
_mesa_bind_vertex_buffer() will update the mask when binding to a
NULL or default buffer so no need to do that update here.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-04-08 11:18:50 +10:00
Timothy Arceri
bfabef0e71 glsl: fix lower jumps for nested non-void returns
Fixes the case where a loop contains a return and the loop is
nested inside an if.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
https://bugs.freedesktop.org/show_bug.cgi?id=100303
2017-04-08 11:18:32 +10:00
Ilia Mirkin
5dd490f134 gallium: fix some math formulas to display better
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-07 20:20:17 -04:00
Ilia Mirkin
60f5766db4 nvc0/ir: fix LSB/BFE/BFI implementations
Overwriting the src register is a very bad idea - it logically maps onto
the TGSI registers, and so is effectively overwriting the source values.

Reported-by: Boyan Ding <boyan.j.ding@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-04-07 20:20:16 -04:00
Nicolai Hähnle
c05cf9cf1b util: fix swizzle of INSTANCEID system value
radeonsi added stricter checking for correct swizzles in debug builds.

Reported-by: Michel Dänzer <michel.daenzer@amd.com>
Fixes: 4cf2942777 ("radeonsi: support 64-bit system values")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-08 00:44:52 +02:00
Bruce Cherniak
07b5b5cfd4 st/glx: Add awareness for multisample pixel formats to st/glx-xlib.
In preparation for enabling MSAA in OpenSWR, the state trackers need to
be aware of multisample pixel formats for software renderers.  This patch
allows glx-xlib to query the renderer for support of pixel
formats with multisample, and create multisample resources.

This change is benign to softpipe and llvmpipe, as is_format_supported
returns FALSE for any sample_count > 1.  OpenSWR does the same at the
moment, but that will change soon.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-04-07 16:50:58 -05:00
Tim Rowley
7bd5057fd1 swr: fix unused variable warnings
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-07 16:50:41 -05:00
Brian Paul
8046c247de glx: silence uninitialized var warning
Signed-off-by: Brian Paul <brianp@vmware.com>
2017-04-07 13:46:44 -06:00
Brian Paul
ee3f75f538 st/mesa: silence unused/uninitialized var warnings
Signed-off-by: Brian Paul <brianp@vmware.com>
2017-04-07 13:46:44 -06:00
Brian Paul
c77c381fae gallivm: init vars to silence gcc warnings
Silence warnings about using possibly uninitialized values.

Signed-off-by: Brian Paul <brianp@vmware.com>
2017-04-07 13:46:44 -06:00
Charmaine Lee
16bd2c6d04 svga: add context pointer to the invalidate surface interface
With this patch, we will specify the current context
when we invalidate the surface before the surface is
put back to the recycled surface pool. This allows the
winsys layer to use the specified context to do the
invalidation rather than using the last context that
referenced the surface. This prevents a race condition if
the last referenced context is now made current in another thread.

Tested with MTT glretrace, NobelClinicianViewer.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2017-04-07 13:46:44 -06:00
Brian Paul
e000b17f87 winsys/svga: use c11 thread types/functions
Gallium no longer has wrappers for mutexes and condition variables.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-07 13:46:44 -06:00
Thomas Hellstrom
0864f9c77a winsys/svga: Resolve command submission buffer contention v3
If two contexts wanted to access the same buffer at the same time, it would
end up on two validation lists simultaneously, which might cause a
PIPE_ERROR_RETRY when trying to validate it from one context while the other
context already had it validated but not yet fenced.

In that situation we could spin until the error goes away, or apply various
more or less expensive locking schemes to save cpu.
Here we use a scheme that briefly locks after fencing but avoids locking on
validation in the non-contended case.

v2:
Make sure we broadcast not only on releasing buffers after fencing, but also
after releasing buffers in the pb_validate_validate error path.
v3:
Don't broadcast on PIPE_ERROR_RETRY because that would increase the chance
of starvation.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2017-04-07 13:46:44 -06:00
Brian Paul
0baa372b6f svga: remove pre-SVGA3D_HWVERSION_WS8_B1 code
3D wasn't officially supported before virtual HW version 8 so we can
remove this old code.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-04-07 13:46:44 -06:00
Brian Paul
690fe77835 st/wgl: sort strings in stw_extension_string[] array
Trivial.
2017-04-07 13:46:44 -06:00
Charmaine Lee
b1c964447a svga: remove redundant surface propagation
Currently, surface propagation for colliding render target resource is
done at framebuffer emit time for vgpu10. This patch
adds the surface propagation for non-vgpu10 path to emit_fb_vgpu9()
and removes the redundant surface copy at set time.

Tested with MTT glretrace, piglit, NobelClinicianViewer, Turbine, Cinebench.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
2017-04-07 13:46:44 -06:00
Charmaine Lee
35a748e79c svga: Fix zslice index to svga_texture_copy_handle_resource()
The zslice index to svga_texture_copy_handle_resource() is not adjusted
and should be a signed integer.

This patch fixes piglit tests for non-vgpu10 including
   spec@arb_framebuffer_object@fbo-generatemipmap-3d
   spec@glsl-1.20@execution@tex-miplevel-selection gl2:texture* 3d

Tested with MTT piglit and glretrace
2017-04-07 13:46:44 -06:00
Brian Paul
5637a497a3 svga: specify include path for git_sha1.h for out-of-src builds
If we're doing an out-of-src build, we need to specify the #include
path to find git_sha1.h

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-04-07 13:46:44 -06:00
Brian Paul
c78fc70e8c st/wgl: pseudo-implementation of WGL_EXT_swap_control
This implementation is based on querying the time just before swap/present
and doing a Sleep() if needed.  There is no sync to vblank or actual
coordination with the GPU.  This isn't perfect, but basically works.
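
For illustration only: a minimal sketch of the timing computation,
assuming a hypothetical helper name and a caller-supplied refresh period
(the commit does not say how the target time is derived).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how long to sleep before presenting so that
 * consecutive swaps are at least `interval` refresh periods apart.
 * All times are in microseconds. */
static int64_t
swap_delay_usec(int64_t last_swap_usec, int64_t now_usec,
                int interval, int64_t refresh_period_usec)
{
   assert(refresh_period_usec > 0);

   if (interval <= 0)
      return 0;  /* swap interval 0: never throttle */

   int64_t target = last_swap_usec + interval * refresh_period_usec;
   return target > now_usec ? target - now_usec : 0;
}
```

On Windows the result would be converted to milliseconds and handed to
Sleep() just before the swap/present call.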

We've had some requests for this functionality, and it sounds like there
are some Windows GL apps that refuse to start if the driver doesn't
advertise this extension.

Note: NVIDIA's Windows OpenGL driver advertises the WGL_EXT_swap_control
string both with wglGetExtensionsStringEXT() and with
glGetString(GL_EXTENSIONS).  We're only advertising it with the former at
this time.

Tested with asst. Mesa demos, Google Earth, Lightsmark, etc.

VMware bug 1591534.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2017-04-07 13:46:43 -06:00
Charmaine Lee
ab96d1baf4 svga: Fix out-of-sync backing surface
When a backing surface is reused, it is possible that
the original surface has been changed. So before the backing surface
is bound again, we need to sync up the surface.
This patch creates a new helper function svga_texture_copy_handle_resource()
to sync up the backing surface resource.

This patch, together with the backing surface dirty bit fix, fixes
the rendering corruption in NobelClinicianViewer when rotating the model.

Also tested with MTT glretrace, piglit, Cinebench, Turbine.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-07 13:46:43 -06:00
Charmaine Lee
a08e3b88ab svga: add a reset flag to svga_propagate_surface()
The reset flag specifies if the dirty bit needs to be reset
after the surface is propagated to the texture. This is used
to make sure that the dirty bit is not reset and stay unset
before the surface is unbound.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-07 13:46:43 -06:00
Charmaine Lee
02c9bf2d54 svga: add the has_backed_views flag
The new has_backed_views flag specifies if any of the render target
views or depth stencil view is a backing surface view.
The flag is used in svga_propagate_rendertargets() so it can return early
if there is no surface to propagate.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-07 13:46:43 -06:00
Charmaine Lee
a421d45e61 svga: only destroy render target view from a context that created it
A texture can be destroyed from a context other than the one that
created it, but destroying the render target view from a different context
will cause svga device errors. Similar to shader resource view,
this patch skips destroying render target view or depth stencil view
from a non-parent context.

Fixes driver errors running NobelClinician Viewer application.

Tested with NobelClinician Viewer, MTT piglit, glretrace.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-07 13:46:43 -06:00
Charmaine Lee
b4c4ee0762 svga: disable rasterization if rasterizer_discard is set or FS undefined
With this patch, rasterization will be disabled if the
rasterizer_discard flag is set or the fragment shader
is undefined due to missing position output from the
vertex/geometry shader.

Tested with piglit test glsl-1.50-geometry-primitive-id-restart.
Also tested with full MTT glretrace and piglit.

v2: As suggested by Roland, to properly disable rasterization, besides
    setting FS to NULL, we will also need to disable depth and stencil test.

v3: As suggested by Brian, set SVGA_NEW_DEPTH_STENCIL_ALPHA dirty bit
    in svga_bind_rasterizer_state() if the rasterizer_discard flag is
    changed.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-07 13:46:43 -06:00
Charmaine Lee
fed72ff6cb svga: do not emulate wide points in GS when doing transform feedback
Emulating wide points in geometry shader when doing transform feedback
is problematic. This patch disables the emulation.

Tested with piglit test ext_transform_feedback-points.
Also tested with MTT glretrace, mesa demos pointblast and spriteblast.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-07 13:46:43 -06:00
Jason Ekstrand
4e17b59f6c anv/query: Use snooping on !LLC platforms
Commit b2c97bc789 which made us start
using a busy-wait for individual query results also messed up cache
flushing on !LLC platforms.  For one thing, I forgot the mfence after
the clflush so memory access wasn't properly getting fenced.  More
importantly, however, was that we were clflushing the whole query range
and then waiting for individual queries and then trying to read the
results without clflushing again.  Getting the clflushing both correct
and efficient is very subtle and painful.  Instead, let's side-step the
problem by just snooping.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-07 12:17:20 -07:00
Emil Velikov
5318d1ff94 anv: provide anv_gem_busy() stub for the tests
Otherwise linking may fail.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100600
Fixes: f195d40eca ("anv/device: Add a helper for querying whether a BO is busy")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2017-04-07 19:45:58 +01:00
Rob Clark
3b32ec3ba6 gallium/util: tweak backtrace format with libunwind
To work with addr2line.sh we also need the relative offset within the
DSO.  And addr2line.sh gets confused by the leading stackframe number.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-07 08:23:02 -04:00
Rob Clark
91dfa02125 gallium/util: cache symbol lookup with libunwind
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-07 08:23:02 -04:00
Rob Clark
7c69ea553b gallium/util: fix missing limit check in libunwind backtrace
Fixes: 70c272004f ("gallium/util: libunwind support")
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-07 08:23:02 -04:00
Timothy Arceri
8046a944d0 mesa: fix renderbuffer leak
We don't need to call _mesa_reference_renderbuffer() for the first
assignment as refCount starts at 1. For swrast we work around the
fact we will indirectly call _mesa_reference_renderbuffer() by
resetting refCount to 0.

Fixes: 32141e53d1 (mesa: tidy up renderbuffer RefCount initialisation)

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-04-07 19:48:10 +10:00
Samuel Iglesias Gonsálvez
1c934bc71b anv/blorp: sample input attachments with resolves on BDW
On Broadwell we still need to do a resolve between the subpass
that writes and the subpass that reads when there is a
self-dependency because HW could not see fast-clears and works
on the render cache as if there was a regular non-fast-clear surface.

Fixes 16 tests on BDW:

dEQP-VK.renderpass.formats.*.input.clear.store.self_dep*

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-07 07:49:43 +02:00
Fredrik Höglund
fd0f539e60 radv: don't call radeon_check_space in radv_BindDescriptorSets
This appears to be a leftover from an earlier version of this function.
Nothing is emitted into the CS.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-04-07 00:54:46 +02:00
Fredrik Höglund
c1f8c83cb6 radv: implement VK_KHR_descriptor_update_template
All offsets and strides are precomputed by
radv_CreateDescriptorUpdateTemplateKHR and stored in the template.

v2: Move the new struct declarations from radv_descriptor_set.h
    to radv_private.h (Bas)

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-04-07 00:54:46 +02:00
Fredrik Höglund
c6487bc48b radv: implement VK_KHR_push_descriptor
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-04-07 00:54:46 +02:00
Fredrik Höglund
3b33f03913 radv: replace an assertion with a conditional
Replace the !binding_layout->immutable_samplers assertion in
radv_update_descriptor_sets with a conditional.

The Vulkan specification does not say that it is illegal to update
a sampler descriptor when it is immutable; only that pImageInfo is
ignored.

This change is also needed for push descriptors, because valid
descriptors must be pushed for all bindings accessed by shaders,
including immutable sampler descriptors.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-04-07 00:54:46 +02:00
Fredrik Höglund
a6e94a87cb radv: refactor radv_UpdateDescriptorSets
Move the implementation into a separate function that takes a
cmd_buffer and a dstSetOverride parameter.

When cmd_buffer is not NULL, radv_update_descriptor_sets calls
cs_add_buffer directly instead of updating the buffer list.

This will be used to implement VK_KHR_push_descriptor.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-04-07 00:54:46 +02:00
Samuel Pitoiset
bedd89429f gallium/radeon: fix typo in radeon_winsys.h
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-07 00:48:19 +02:00
Samuel Pitoiset
7839243085 mesa/main: simplify _mesa_IsRenderbuffer()
_mesa_lookup_renderbuffer() already checks if 'id' is non-zero.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-04-07 00:48:01 +02:00
Timothy Arceri
93d7014c1d mesa: stop abstracting texture object hashtable locking
This doesn't do anything useful so just remove it.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-07 08:03:02 +10:00
Timothy Arceri
31cb6fd0a3 mesa: stop abstracting buffer object hashtable locking
This doesn't do anything useful so just remove it.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-04-07 08:02:54 +10:00
Jason Ekstrand
c9c39812b9 i965/blorp: Bump the batch space estimate
Commit f938354362 recently increased the
alignment on vertex buffer data from 32 to 64.  This caused us to
consume a bit more batch than we were before and we now go over the
estimate by a small amount on certain blits on gen8+.  This commit bumps
the gen8 batch estimate by a bit to compensate.  Haswell and older
still seem to be well within the limit.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100582
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-04-06 13:32:29 -07:00
Jordan Justen
0370350d11 intel/aubinator: Stop searching after a custom handler is found
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-04-06 13:26:08 -07:00
Jordan Justen
d5bd0e411e intel/gen_decoder: return -1 for unknown command formats
Decoding with aubinator encountered a command of 0xffffffff. With the
previous code, it caused aubinator to jump 255 + 2 dwords to start
decoding again.

Instead we can attempt to detect the known instruction formats. If the
format is not recognized, then we can advance just 1 dword.
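
For illustration only: a minimal sketch of the caller-side pattern. The
classifier below is a deliberate simplification with hypothetical names;
it pretends that only command types 0, 2 and 3 (header bits 31:29) are
known and that the length always sits in bits 7:0 biased by 2. Real
command formats are more varied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classifier: returns the command length in dwords, or -1
 * when the header does not match a format this sketch knows about. */
static int
sketch_get_length(uint32_t header)
{
   uint32_t type = header >> 29;
   if (type == 0 || type == 2 || type == 3)
      return (int)(header & 0xff) + 2;
   return -1;  /* unknown format, e.g. 0xffffffff */
}

/* Walk a batch and count the commands visited, stepping over a single
 * dword whenever the length is unknown instead of trusting garbage. */
static size_t
sketch_walk(const uint32_t *batch, size_t ndwords)
{
   assert(batch != NULL || ndwords == 0);

   size_t visited = 0;
   for (size_t i = 0; i < ndwords; ) {
      int length = sketch_get_length(batch[i]);  /* signed on purpose */
      i += length > 0 ? (size_t)length : 1;
      visited++;
   }
   return visited;
}
```

Keeping the length in a signed int is what lets the -1 sentinel survive
the comparison, which is the point of the v2 note below.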

v2:
 * Update aubinator_error_decode
 * Actually convert the length variable returned into a *signed* integer
   in aubinator.c, intel_batchbuffer.c and aubinator_error_decode.c.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-04-06 13:26:08 -07:00
Jordan Justen
7c33372f82 intel/gen_decoder: Fix length for Media State/Object commands
From BDW PRM, Volume 6: Command Stream Programming, 'Render Command
Header Format'.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-04-06 13:26:08 -07:00
Jordan Justen
3c77a57222 intel/aubinator_error_decode: Fix structure decode data
The call to gen_print_group should provide a pointer to the beginning
of the structure data, not the start of the batch data.

Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-04-06 13:25:38 -07:00
Nicolai Hähnle
2357e7a202 st/pbo: select the right swizzle for instance IDs
The system value only has an X component, and radeonsi started
checking that in debug builds.

Reported-by: Michel Dänzer <michel.daenzer@amd.com>
Fixes: 4cf2942777 ("radeonsi: support 64-bit system values")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-06 20:26:27 +02:00
Jason Ekstrand
b2c97bc789 anv/query: Busy-wait for available query entries
Before, we were just looking at whether or not the user wanted us to
wait and waiting on the BO.  Some clients, such as the Serious engine,
use a single query pool for hundreds of individual query results where
the writes for those queries may be split across several command
buffers.  In this scenario, the individual query we're looking for may
become available long before the BO is idle so waiting on the query pool
BO to be finished is wasteful. This commit makes us instead busy-loop on
each query until it's available.

This significantly reduces pipeline bubbles and improves performance of
The Talos Principle on medium settings (where the GPU isn't overloaded
with drawing) by around 20% on my SkyLake gt4.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
2017-04-05 21:17:11 -07:00
Jason Ekstrand
f195d40eca anv/device: Add a helper for querying whether a BO is busy
This is a bit more efficient than using GEM_WAIT with a timeout of 0.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-05 21:17:11 -07:00
Tim Rowley
d5157ddca4 swr: [rasterizer core] SIMD16 Frontend WIP
Implement widened binner for SIMD16

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-05 18:20:45 -05:00
Tim Rowley
b8515d5c0f swr: [rasterizer core] Enable 8x2 backend
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-05 18:20:45 -05:00
Tim Rowley
c1b7a5780d swr: [rasterizer codegen] remove copy of mako
mako is already a mesa build requirement, extra copy not needed.

Tested building against mesa build baseline (mako-0.8.0).

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-05 18:20:45 -05:00
Tim Rowley
97dab87a22 swr: [rasterizer core/memory] Move intrinsics to _simd functions
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-05 18:20:19 -05:00
Tim Rowley
117fc582f8 swr: [rasterizer core] Programmable sample position support
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-05 18:19:25 -05:00
Tim Rowley
3c52a7316a swr: [configure.ac/scons] require c++14
New C++ features used by upcoming swr changes.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-05 18:19:16 -05:00
Tim Rowley
e5fdfcf836 swr: [rasterizer core] Fix center sample pattern
Fix long hidden bug in rasterizer handling of center sample pattern.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-05 18:19:10 -05:00
Tim Rowley
c12b61d158 swr: [rasterizer core/memory] Fix missing avx512 storetile
Fix pre-processor macro handling to eliminate silently missing
implementation for AVX512.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-05 18:19:04 -05:00
Tim Rowley
cd6c200223 swr: [rasterizer core] SIMD16 Frontend WIP
Implement widened VS output for SIMD16

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-04-05 18:18:36 -05:00
Timothy Arceri
1bfeb65397 mesa: use internal function when deleting buffers
This avoids validation and looking up the buffer target for a second time.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-06 08:25:36 +10:00
Timothy Arceri
8feb5bb402 mesa: rework bind_buffer_object()
This allows internal users to pass buffer objects directly and
allows for KHR_no_error support to be more easily added.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-06 08:25:36 +10:00
Timothy Arceri
d1c1544a49 mesa: small texstate tidy up
Possibly more efficient, either way it makes the code easier to
follow.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-06 08:25:36 +10:00
Timothy Arceri
32141e53d1 mesa: tidy up renderbuffer RefCount initialisation
42aaa548 changed the renderbuffer initialisation of RefCount from
1 to 0.

This is inconsistent with how we use RefCount elsewhere. Also every
driver implementation of NewRenderbuffer() calls
_mesa_init_renderbuffer() so it's safe to set it there.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-06 08:17:10 +10:00
Christian Gmeiner
e75001811e Revert "etnaviv: Cannot render to rb-swapped formats"
This reverts commit 658568941d.

With the help of shader variants we can render to rb-swapped
formats now. Fixes about 60 piglits.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-04-05 19:58:25 +02:00
Christian Gmeiner
7f62ffb68a etnaviv: add support for rb swap
If we render to an rb-swapped format, we will create a shader variant
doing the involved swizzling in the pixel shader.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-04-05 19:58:22 +02:00
Christian Gmeiner
8d9a31ef97 etnaviv: adapt shader-db output for variant support
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-04-05 19:58:18 +02:00
Christian Gmeiner
20fa8f1989 etnaviv: bring back shader-db traces
If shader-db is run, create a standard variant immediately
(as otherwise nothing will trigger the shader to be
actually compiled).

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-04-05 19:58:13 +02:00
Christian Gmeiner
7d2a806266 etnaviv: add etna_shader_key and generate variants if needed
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-04-05 19:58:10 +02:00
Christian Gmeiner
9da54fdcb5 etnaviv: pass a preallocated variant to compiler
In the long run the compiler needs to know the specific variant
'key' in order to compile appropriate assembly. With this commit
the variant knows its shader and we are able to pass the preallocated
variant into etna_compile_shader(..). This saves us from passing
extra ptrs everywhere.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-04-05 19:58:07 +02:00
Christian Gmeiner
ffd4762310 etnaviv: make specs const
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-04-05 19:58:03 +02:00
Christian Gmeiner
ecc2474e59 etnaviv: add struct etna_shader_state
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-04-05 19:57:59 +02:00
Christian Gmeiner
65e9bd2703 etnaviv: add basic shader variant support
This commit adds some basic infrastructure to handle shader
variants. We are still creating exactly one shader variant
for each shader.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-04-05 19:57:56 +02:00
Christian Gmeiner
59b459ac17 etnaviv: s/etna_shader/etna_shader_variant
Prep work to add shader variant support.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-04-05 19:57:52 +02:00
Christian Gmeiner
54e367bf0e etnaviv: remove not needed forward declarations
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-04-05 19:57:47 +02:00
Emil Velikov
13181abc6d gallium/util: honour LIBUNWIND_CFLAGS
Fixes: 70c272004f ("gallium/util: libunwind support")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-05 18:42:56 +01:00
Rhys Kidd
115e684792 travis: Add radeonsi to continuous integration
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-05 18:19:51 +01:00
Rhys Kidd
787ab42716 travis: Add radv vulkan driver to continuous integration
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-05 18:19:28 +01:00
Emil Velikov
a6840efc09 anv: provide required gem stubs for the tests
Introduce stubs to anv_gem_stub.c that match the anv_gem.c ones.
Otherwise we may get link-time errors, when building the tests.

v2: Introduce all the missing stubs at once.

Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Vinson Lee <vlee@freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100574
Fixes: c964f0e485 ("anv: Query the kernel for reset status")
Fixes: 651ec926fc ("anv: Add support for 48-bit addresses")
Fixes: 060a6434ec ("anv: Advertise larger heap sizes")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
---
I've intentionally kept the order identical to that of anv_gem.c.
This way we can easily grep & diff in the future ;-)
2017-04-05 17:54:38 +01:00
Emil Velikov
8307124829 configure.ac: pthread-stubs is not a thing on GNU/kFreeBSD
As mentioned on the xcb mailing list, the platform uses the GLIBC
forwarding mechanism.

https://lists.freedesktop.org/archives/xcb/2016-November/010896.html

Cc: Andreas Boll <andreas.boll.dev@gmail.com>
Reported-by: Andreas Boll <andreas.boll.dev@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-05 17:47:41 +01:00
Aaron Watry
4d0399f175 st/clover: Fix build after shrink of pipe_box
Fixes: 3dfe61e ("gallium: decrease the size of pipe_box - 24 -> 16 bytes")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100569
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2017-04-05 09:19:48 -05:00
Alex Deucher
d921af62f5 radeonsi: add new polaris10 pci id
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-04-05 10:13:08 -04:00
Nicolai Hähnle
9e1b2e4d97 radeonsi: enable ARB_shader_ballot
Require LLVM 5.0 or later because LLVM 4.0 is easily fooled into
putting the lane select of llvm.amdgcn.readlane into a VGPR and then
fails to continue to compile.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 15:29:44 +02:00
Nicolai Hähnle
8b13b11f11 radeonsi: optimization barriers to work around LLVM deficiencies
Notably, llvm.amdgcn.readfirstlane and llvm.amdgcn.icmp may be hoisted
out of loops or if/else branches in cases like

  if (cond) {
    v = readFirstInvocationARB(x);
    ... use v ...
  } else {
    v = readFirstInvocationARB(x);
    ... use v ...
  }
===>
  v = readFirstInvocationARB(x);
  if (cond) {
    ... use v ...
  } else {
    ... use v ...
  }

The optimization barrier is a heavy hammer to stop that until LLVM
is taught the semantics of the intrinsic properly.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 15:29:44 +02:00
Nicolai Hähnle
24d4fbe226 radeonsi: strengthen emit_optimization_barrier
LLVM will lift inline assembly out of if-else-blocks if both paths have
the same inline assembly. Prevent this by adding an irrelevant unique
text to the assembly.

This requires the LLVM assembly parser to be initialized.

Furthermore, allow forcing subsequent computations to happen after the
optimization barrier by defining a data dependency.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 15:29:43 +02:00
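The data-dependency idea in 24d4fbe226 above can be illustrated with a rough C-level analogy. This is only a sketch, not the LLVM IR the driver emits: the helper names are hypothetical, the extended `__asm__` syntax assumes GCC or Clang, and the "unique text" trick from the message is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: an empty inline-asm statement that takes `v` as a
 * read-write operand. The compiler must assume the asm may change `v`, so
 * anything computed from the return value is ordered after the barrier.
 * Assumes GCC- or Clang-style extended asm. */
static inline int32_t opt_barrier_i32(int32_t v)
{
    __asm__ volatile("" : "+r"(v));
    return v;
}

/* Both branches pass the value through the barrier before using it. */
int32_t select_after_barrier(int cond, int32_t x)
{
    int32_t v;
    if (cond) {
        v = opt_barrier_i32(x);
        return v + 1;
    } else {
        v = opt_barrier_i32(x);
        return v - 1;
    }
}
```

The barrier does not change the value; it only hides it from the optimizer, so later uses depend on the barrier's output.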
Nicolai Hähnle
5c4602f4a2 radeonsi: emit TGSI_OPCODE_READ_*
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 15:29:43 +02:00
Nicolai Hähnle
b46e3a30b7 radeonsi: emit TGSI_OPCODE_BALLOT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 15:29:43 +02:00
Nicolai Hähnle
a3075f4799 radeonsi: implement TGSI_SEMANTIC_SUBGROUP_*
64-bit system values are stored as v2i32 to simplify the fetch logic.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 15:29:43 +02:00
Nicolai Hähnle
4cf2942777 radeonsi: support 64-bit system values
For simplicity, always store system values as 32-bit values or arrays
of 32-bit values. 64-bit values are unpacked and packed accordingly.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 15:29:43 +02:00
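The packing and unpacking mentioned in 4cf2942777 above can be sketched in plain C. The function names and the low-word-first order are assumptions made for this illustration; the driver does this on LLVM values rather than on C integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split a 64-bit value into two 32-bit words: out[0] = low, out[1] = high.
 * The low-word-first order is an assumption of this sketch. */
void unpack_u64(uint64_t value, uint32_t out[2])
{
    assert(out != NULL);
    out[0] = (uint32_t)(value & 0xffffffffu);
    out[1] = (uint32_t)(value >> 32);
}

/* Rebuild the 64-bit value from the two 32-bit words. */
uint64_t pack_u64(const uint32_t in[2])
{
    assert(in != NULL);
    return ((uint64_t)in[1] << 32) | (uint64_t)in[0];
}
```

Storing everything as 32-bit words keeps a single fetch path; only consumers of 64-bit values pay for the repacking.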
Nicolai Hähnle
1ee57b16be radeonsi: bump RADEON_LLVM_MAX_SYSTEM_VALUES
ARB_shader_ballot introduces 7 new system values that can be used
in all shader stages.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 15:29:42 +02:00
Nicolai Hähnle
ee2d93eb92 st/mesa: enable ARB_shader_ballot
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 15:29:42 +02:00
Nicolai Hähnle
84039cc1c3 st/glsl_to_tgsi: implement ARB_shader_ballot system variables
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 15:29:42 +02:00
Nicolai Hähnle
76e3dba289 st/glsl_to_tgsi: implement ARB_shader_ballot builtin functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 15:29:41 +02:00
Ilia Mirkin
08bd0aa507 tgsi: add SUBGROUP_* semantics
v2: add documentation (Nicolai)

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 15:29:41 +02:00
Ilia Mirkin
3650d7455f tgsi: add BALLOT/READ_* opcodes
v2 (Nicolai):
- BALLOT isn't per-channel
- expand the documentation (also for VOTE_*)

v3:
- only BALLOT returns a 64-bit lanemask (Boyan)
- relax the requirement on READ_INVOC: the invocation number to read
  from must be uniform within a sub-group. This matches the
  GL_ARB_shader_ballot spec (and the v_readlane instruction of AMD
  GCN)

v4:
- hopefully really fix the doc of VOTE_* returns (Ilia)

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
2017-04-05 15:29:34 +02:00
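To make the semantics noted for 3650d7455f above concrete, here is a small CPU emulation. It is a sketch only: the 64-wide sub-group, the explicit active mask, and the function names are assumptions of this example, not part of TGSI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SUBGROUP_SIZE 64 /* assumed sub-group width for this sketch */

/* Emulated ballot: bit i of the result is set when invocation i is active
 * and its predicate is true. One 64-bit mask for the whole sub-group,
 * not one result per channel. */
uint64_t emulate_ballot(const bool predicate[SUBGROUP_SIZE], uint64_t active_mask)
{
    uint64_t mask = 0;
    for (unsigned i = 0; i < SUBGROUP_SIZE; i++) {
        if (((active_mask >> i) & 1) && predicate[i])
            mask |= (uint64_t)1 << i;
    }
    return mask;
}

/* Emulated read from one invocation. `lane` is a single argument shared by
 * every invocation, which models the "uniform within a sub-group" rule. */
uint32_t emulate_read_invoc(const uint32_t values[SUBGROUP_SIZE], unsigned lane)
{
    assert(lane < SUBGROUP_SIZE);
    return values[lane];
}
```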
Nicolai Hähnle
d3e6f6d7f7 gallium: add PIPE_CAP_TGSI_BALLOT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 15:29:31 +02:00
Nicolai Hähnle
b5711d5e1a glsl: add gl_SubGroup*ARB builtins
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 15:25:56 +02:00
Nicolai Hähnle
961b8e9afe glsl: add ARB_shader_ballot builtin functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 15:25:54 +02:00
Nicolai Hähnle
d37b7b5232 glsl: add ARB_shader_ballot operations
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 15:25:51 +02:00
Nicolai Hähnle
b8440ec9fa glsl: add ARB_shader_ballot enable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 15:25:48 +02:00
Nicolai Hähnle
4fdb691f10 mesa: add GL_ARB_shader_ballot boilerplate
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 15:25:40 +02:00
Emil Velikov
2c4c47dcb7 swr: automake: add gen_common.py to the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-05 13:16:28 +01:00
Emil Velikov
e664cfc5a7 intel: genxml: automake: include gen_bits_header.py in the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-05 13:16:28 +01:00
Emil Velikov
e180680980 intel: genxml: automake: polish automake rules
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-05 13:16:28 +01:00
Emil Velikov
e2adec3a17 amd/addrlib: automake: add all headers to the tarball
Fixes: 7f160efcde ("amd/addrlib: import gfx9 support")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-05 13:16:28 +01:00
Nicolai Hähnle
570e50af4b radeonsi: enable ARB_sparse_buffer
v2:
- fill in DRM version requirement
- disable on SI due to CP DMA faults

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:44:32 +02:00
Nicolai Hähnle
aee473eb01 radeonsi: disable SDMA clears and copies for sparse buffers
VM faults cannot be disabled for SDMA on <= VI.

We could still use SDMA by asking the winsys about which parts of the
buffers are committed. This is left as a potential future improvement.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:19 +02:00
Nicolai Hähnle
0a685ce9a7 gallium/radeon: implement pipe->resource_commit
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:19 +02:00
Nicolai Hähnle
e077c5fe65 gallium/radeon: transfers and invalidation for sparse buffers
Sparse buffers can never be mapped by the CPU.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:19 +02:00
Nicolai Hähnle
5969a373a1 gallium/radeon: implement sparse buffer creation
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:19 +02:00
Nicolai Hähnle
47e59a7e36 winsys/amdgpu: sparse buffer debugging helpers
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:19 +02:00
Nicolai Hähnle
0baee15596 winsys/amdgpu: take fences when freeing a backing buffer
We never add fences to backing buffers during submit. When we free a
backing buffer, it must inherit the sparse buffer's fences, so that it
doesn't get re-used prematurely via the cache.

v2:
- remove pipe_mutex_*

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:18 +02:00
Nicolai Hähnle
79dae12b41 winsys/amdgpu: add sparse buffers to CS
... and implement the corresponding fence handling.

v2:
- add missing bit in amdgpu_bo_is_referenced_by_cs_with_usage
- remove pipe_mutex_*

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:18 +02:00
Nicolai Hähnle
667da4eaed winsys/amdgpu: sparse buffer creation / destruction / commitment
This is the bulk of the buffer allocation logic. It is fairly simple and
stupid. We'll probably want to use e.g. interval trees at some point to
keep track of commitments, but Mesa doesn't have an implementation of those
yet.

v2:
- remove pipe_mutex_*
- fix total_backing_pages accounting
- simplify by using the new VA_OP_CLEAR/REPLACE kernel interface

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:18 +02:00
Nicolai Hähnle
e348248647 winsys/amdgpu: add sparse buffer data structures
v2:
- remove pipe_mutex_*
- use a simple page commitment array

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:18 +02:00
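The "simple page commitment array" mentioned in e348248647 above could look roughly like the following. Every name here is hypothetical and the real winsys structures track more state (backing buffers, fences); this only shows the one-flag-per-page idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical structure: one flag per page of a sparse buffer. */
struct sparse_commitments {
    uint32_t num_pages;
    bool *committed; /* committed[i] is true when page i has backing memory */
};

bool sparse_init(struct sparse_commitments *s, uint32_t num_pages)
{
    assert(s != NULL && num_pages > 0);
    s->num_pages = num_pages;
    s->committed = calloc(num_pages, sizeof(bool));
    return s->committed != NULL;
}

/* Mark pages [first, first + count) as committed or uncommitted and return
 * how many pages actually changed state. */
uint32_t sparse_set_range(struct sparse_commitments *s, uint32_t first,
                          uint32_t count, bool commit)
{
    assert(first <= s->num_pages && count <= s->num_pages - first);
    uint32_t changed = 0;
    for (uint32_t i = first; i < first + count; i++) {
        if (s->committed[i] != commit) {
            s->committed[i] = commit;
            changed++;
        }
    }
    return changed;
}

void sparse_fini(struct sparse_commitments *s)
{
    free(s->committed);
    s->committed = NULL;
}
```

A flat array costs O(pages) per range update, which is the trade-off the interval-tree remark in 667da4eaed alludes to.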
Nicolai Hähnle
f3e514361c winsys/amdgpu: extend amdgpu_add_fence to allow adding multiple fences
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:18 +02:00
Nicolai Hähnle
ae4f442304 winsys/amdgpu: build handles and flags list late on submit thread
This probably has only minor performance effects, but it simplifies some
subsequent code slightly.

Ideally, it could also be used to simplify the handling of slab buffers
in the same way, but unfortunately that's not possible as long as we need
indices for relocations.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:17 +02:00
Nicolai Hähnle
0e476f6c03 winsys/amdgpu: share common code in amdgpu_add_fence_dependencies
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:17 +02:00
Nicolai Hähnle
1c125fdef0 winsys/amdgpu: extract amdgpu_do_add_real_buffer
We will use it for delayed adding of sparse buffers' backing buffers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:17 +02:00
Nicolai Hähnle
a338f427ac winsys/radeon: sparse buffers will not be supported
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:17 +02:00
Nicolai Hähnle
c2637a17d9 radeon/winsys: add sparse buffer interface
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:17 +02:00
Nicolai Hähnle
d9bc4d8305 st/mesa: plumbing for sparse buffers
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:16 +02:00
Nicolai Hähnle
2599b23f7c st/mesa: enable ARB_sparse_buffer when supported
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:16 +02:00
Nicolai Hähnle
634266c952 trace: add resource_commit pass-through
v2: fix return type to bool (Marek)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:16 +02:00
Nicolai Hähnle
0e1c75acae ddebug: add resource_commit pass-through
v2: fix return type to bool (Marek)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:16 +02:00
Nicolai Hähnle
d6e6fa01a5 gallium: add sparse buffer interface and capability
v2:
- explain the resource_commit interface in more detail

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:37:04 +02:00
Nicolai Hähnle
4e6feacf6a mesa: implement sparse buffer commitment
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:31:02 +02:00
Nicolai Hähnle
d6fcbe1c2a mesa: implement sparse storage buffer allocation
v2:
- spec quote and style (Ian)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:31:01 +02:00
Nicolai Hähnle
94227684c4 mesa: implement SPARSE_BUFFER_PAGE_SIZE_ARB
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:31:01 +02:00
Nicolai Hähnle
d085c7ce7c mesa: Add GL_ARB_sparse_buffer boilerplate
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:31:01 +02:00
Nicolai Hähnle
a0970de839 configure.ac: require libdrm_amdgpu 2.4.77
The sparse buffer implementation requires amdgpu_bo_va_op_raw.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-05 10:30:42 +02:00
Matt Turner
d5ee55f028 mesa: Replace program locks with atomic inc/dec.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-05 14:54:49 +10:00
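The general technique named in d5ee55f028 above, replacing a lock around a reference count with atomic increment and decrement, can be sketched with C11 `<stdatomic.h>`. The struct and function names are hypothetical, and this sketch does not use whichever atomic helpers the patch itself relies on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted object. */
struct refcounted {
    atomic_int refcount;
};

void ref_init(struct refcounted *obj)
{
    atomic_init(&obj->refcount, 1);
}

/* Take an additional reference; no mutex is needed. */
void ref_acquire(struct refcounted *obj)
{
    int old = atomic_fetch_add(&obj->refcount, 1);
    assert(old > 0);
    (void)old;
}

/* Drop a reference. Returns true when the caller released the last one and
 * is therefore responsible for destroying the object. */
bool ref_release(struct refcounted *obj)
{
    int old = atomic_fetch_sub(&obj->refcount, 1);
    assert(old > 0);
    return old == 1;
}
```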
Jason Ekstrand
060a6434ec anv: Advertise larger heap sizes
Instead of just advertising the aperture size, we do something more
intelligent.  On systems with a full 48-bit PPGTT, we can address 100%
of the available system RAM from the GPU.  In order to keep clients from
burning 100% of your available RAM for graphics resources, we have a
nice little heuristic (which has received exactly zero tuning) to keep
things under a reasonable level of control.

Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>
2017-04-04 18:33:52 -07:00
Jason Ekstrand
651ec926fc anv: Add support for 48-bit addresses
This commit adds support for using the full 48-bit address space on
Broadwell and newer hardware.  Thanks to certain limitations, not all
objects can be placed above the 32-bit boundary.  In particular, general
and state base address need to live within 32 bits.  (See also
Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.)  In order
to handle this, we add a supports_48bit_address field to anv_bo and only
set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set.  We set the bit
for all client-allocated memory objects but leave it false for
driver-allocated objects.  While this is more conservative than needed,
all driver allocations should easily fit in the first 32 bits of address
space, and this keeps things simple because we don't have to think about
whether or not any given one of our allocation data structures will be
used in a 48-bit-unsafe way.

Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>
2017-04-04 18:33:52 -07:00
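The opt-in scheme described in 651ec926fc above can be sketched as follows. The struct is a simplified stand-in for anv_bo with only the one field from the message, and the flag constant is a local placeholder whose value is chosen for illustration rather than copied from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the kernel's EXEC_OBJECT_SUPPORTS_48B_ADDRESS; the
 * value is illustrative only. */
#define SKETCH_EXEC_OBJECT_SUPPORTS_48B_ADDRESS (UINT64_C(1) << 3)

/* Simplified stand-in for anv_bo. */
struct sketch_bo {
    bool supports_48bit_address;
};

/* Client-allocated memory opts in; driver-allocated objects stay below the
 * 32-bit boundary. */
struct sketch_bo sketch_bo_create(bool client_allocated)
{
    struct sketch_bo bo = { .supports_48bit_address = client_allocated };
    return bo;
}

/* Only objects that opted in get the 48-bit flag at submission time. */
uint64_t sketch_exec_flags(const struct sketch_bo *bo)
{
    assert(bo != NULL);
    uint64_t flags = 0;
    if (bo->supports_48bit_address)
        flags |= SKETCH_EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
    return flags;
}
```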
Jason Ekstrand
439da38d18 anv: Replace anv_bo::is_winsys_bo with a uint32_t flags
Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>
2017-04-04 18:33:52 -07:00
Jason Ekstrand
f938354362 i965/blorp: Align vertex buffers to 64B
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-04-04 18:33:52 -07:00
Jason Ekstrand
5d1ba2cb04 anv/blorp: Align vertex buffers to 64B
This fixes issues seen when adding support for full 48-bit addresses.
The 48-bit addresses themselves have nothing to do with it other than
that it caused the kernel to place buffers slightly differently so they
interacted differently with the caches.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-04-04 18:33:52 -07:00
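Aligning a buffer offset or size to 64 bytes, as in 5d1ba2cb04 and f938354362 above, is usually done with a power-of-two round-up. The helper name below is hypothetical; the patches may use an existing utility instead.

```c
#include <assert.h>
#include <stdint.h>

/* Round `value` up to the next multiple of `alignment`, which must be a
 * non-zero power of two. */
uint32_t align_up_u32(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}
```

For example, `align_up_u32(offset, 64)` leaves an already aligned offset unchanged and moves any other offset to the start of the next 64-byte block.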
Jason Ekstrand
c964f0e485 anv: Query the kernel for reset status
When a client causes a GPU hang (or experiences issues due to a hang in
another client) we want to let it know as soon as possible.  In
particular, if it submits work with a fence and calls vkWaitForFences or
vkQueueWaitIdle and it returns VK_SUCCESS, then the client should be
able to trust the results of that rendering.  In order to provide this
guarantee, we have to ask the kernel for context status in a few key
locations.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-04 18:33:52 -07:00
Jason Ekstrand
82573d0f75 anv: Check for device loss at the end of WaitForFences
It's possible that the device could have been lost while we were
waiting.  We should let the user know if this has happened.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-04 18:33:51 -07:00
Jason Ekstrand
c6f69eea6a anv/pipeline: Properly handle unset gl_Layer and gl_ViewportIndex
When the shader does not set one of these values, they are supposed to
get a default value of 0.  We have hardware bits in 3DSTATE_CLIP for
this but haven't been setting them.  This fixes the intermittent failure
of dEQP-VK.geometry.layered.3d.render_to_default_layer.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-04-04 18:33:51 -07:00
Jason Ekstrand
3503b2714b i965/fs: Always provide a default LOD of 0 for TXS and TXL
We already provide a default LOD for textureQueryLevels and texture() on
non-fragment stages.  However, there are more cases where one is needed
such as textureSize(gsampler2DMS*) in SPIR-V.  Instead of trying to list
out all of the cases one at a time, just provide the default for all TXS
and TXL operations.  This fixes a shader validation error in the new
Sascha deferredmultisampling demo which uses textureSize(gsampler2DMS).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100391
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-04-04 18:33:35 -07:00
Kenneth Graunke
c5bf7cb529 mesa: Require mipmap completeness for glCopyImageSubData(), sometimes.
This patch makes glCopyImageSubData require mipmap completeness when the
texture object's built-in sampler object has a mipmapping MinFilter.

Fixes (on i965):
dEQP-GLES31.functional.debug.negative_coverage.*.buffer.copy_image_sub_data

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-04-04 17:35:18 -07:00
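The "sometimes" in c5bf7cb529 above depends on whether the minification filter uses mipmaps. A minimal sketch of that test follows; the enum values are the standard OpenGL ones, while the function names and the fallback to base-level completeness for non-mipmapping filters are assumptions of this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Standard OpenGL minification filter enum values. */
#define GL_NEAREST                0x2600
#define GL_LINEAR                 0x2601
#define GL_NEAREST_MIPMAP_NEAREST 0x2700
#define GL_LINEAR_MIPMAP_NEAREST  0x2701
#define GL_NEAREST_MIPMAP_LINEAR  0x2702
#define GL_LINEAR_MIPMAP_LINEAR   0x2703

/* True when the filter samples from more than the base level. */
bool min_filter_uses_mipmaps(unsigned min_filter)
{
    switch (min_filter) {
    case GL_NEAREST_MIPMAP_NEAREST:
    case GL_LINEAR_MIPMAP_NEAREST:
    case GL_NEAREST_MIPMAP_LINEAR:
    case GL_LINEAR_MIPMAP_LINEAR:
        return true;
    default:
        return false;
    }
}

/* Hypothetical check: require mipmap completeness only for mipmapping
 * filters, and base-level completeness otherwise (an assumption here). */
bool texture_complete_for_copy(unsigned min_filter, bool base_complete,
                               bool mipmap_complete)
{
    return min_filter_uses_mipmaps(min_filter) ? mipmap_complete
                                               : base_complete;
}
```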
Vinson Lee
c161a10462 libgl-xlib: Link with libunwind.
Fix linking error.

  CXXLD    libGL.la
../../../../src/gallium/auxiliary/.libs/libgallium.a(u_debug_stack.o): In function `debug_backtrace_capture':
src/gallium/auxiliary/util/u_debug_stack.c:59: undefined reference to `_Ux86_64_getcontext'
src/gallium/auxiliary/util/u_debug_stack.c:60: undefined reference to `_ULx86_64_init_local'
src/gallium/auxiliary/util/u_debug_stack.c:62: undefined reference to `_ULx86_64_step'
src/gallium/auxiliary/util/u_debug_stack.c:71: undefined reference to `_ULx86_64_get_proc_info'
src/gallium/auxiliary/util/u_debug_stack.c:73: undefined reference to `_ULx86_64_get_proc_name'
src/gallium/auxiliary/util/u_debug_stack.c:65: undefined reference to `_ULx86_64_step'

Fixes: 70c272004f ("gallium/util: libunwind support")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100562
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2017-04-04 16:47:41 -07:00
Jason Ekstrand
1fde054b8f intel/isl: Refactor and clarify gen8 alignment calculations
Adding the actual table from the docs makes it clearer exactly what the
restrictions are.  In particular, it becomes clear that compressed
textures ignore the alignment parameters in RENDER_SURFACE_STATE.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-04-04 14:51:57 -07:00
Francisco Jerez
0de17f52a5 drirc: Set glsl_zero_init for Kerbal Space Program.
This fixes the stripes of garbage rendered on the floor of the vehicle
assembly building among other rendering issues.  The reason for the
misrendering seems to be that some of the GLSL shaders used by the
application use variables before initializing them, incorrectly
assuming that they will be implicitly set to zero by the
implementation.

Acked-by: Matt Turner <mattst88@gmail.com>
2017-04-04 14:13:03 -07:00
Lionel Landwerlin
e8d9b76f63 intel: tools: add aubinator_error_decode tool
This is pretty much the same tool as what i-g-t has, only with a more
fancy decoding of the instructions/registers. It also doesn't support
anything before gen4.

v2 (from Matt): Drop authors
                Remove undefined automake variable

v3: Fix incorrect offsets for dword > 1 (Jordan)

v4: Fix decompression error with large blobs (Jordan)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2017-04-04 21:22:26 +01:00
Lionel Landwerlin
567d77885e intel: genxml: add RING_BUFFER_CTL registers
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-04 21:22:26 +01:00
Lionel Landwerlin
6f260ff049 intel: genxml: add FAULT_REG register
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-04 21:22:26 +01:00
Lionel Landwerlin
ca2771fa18 intel: genxml: add gen7 ERR_INT register
v2: add register to gen7.5 (Matt)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-04 21:22:26 +01:00
Lionel Landwerlin
84613bf6d5 intel: genxml: add ACTHD registers
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-04 21:22:26 +01:00
Lionel Landwerlin
0f195f22aa intel: genxml: add GFX_ARB_ERROR_RPT register
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-04 21:22:26 +01:00
Lionel Landwerlin
d1a7a54d77 intel: genxml: add INSTDONE registers
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-04 21:22:26 +01:00
Marek Olšák
18b12bf533 targets: export radeon winsys_create functions to silence LLVM warning
It silences the following radeonsi LLVM warning due to a previous
commit adding an LLVM workaround:
  "mesa: for the -simplifycfg-sink-common option: may only occur zero or one
   times!"

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-04 22:15:47 +02:00
Constantine Kharlamov
6ee486899b r600g: check rasterizer primitive states like in radeonsi
Specifically, skip non-line primitives and default to a reset on each
packet.

Skipping non-line primitives saves ≈110 resets of the
PA_SC_LINE_STIPPLE register per frame in Kane&Lynch2.

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-04-04 22:15:47 +02:00
Constantine Kharlamov
7ade08e2a8 r600g: extract code into r600_emit_rasterizer_prim_state()
Also change gs_output_prim type: unsigned → pipe_prim_type. The idea of
the code is mostly taken from radeonsi. The new code operating on
prev/curr rast_primitives saves ≈15 reloads of PA_SC_LINE_STIPPLE per
frame in Kane&Lynch2

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-04-04 22:15:47 +02:00
Constantine Kharlamov
fa8bc90990 r600g/radeonsi: use the correct types (taken from pipe_draw_info)
Note: si_shader.h also has a "type" variable that should be changed to
"enum pipe_prim_type"; however, that triggers a bunch of warnings about
unhandled switches, so, not knowing the correct way to handle them, I
decided to leave it as is.

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-04-04 22:15:47 +02:00
Constantine Kharlamov
ef62a7651c r600g: remove duplicate memset by using a pointer, and constify args
Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-04-04 22:15:47 +02:00
Elie TOURNIER
ba5b1ab3e0 glsl: remove unused file
udivmod64 appears in src/compiler/glsl/builtin_int64.h and src/compiler/glsl/udivmod.h
The second file seems unused.
Fix commit 6b03b345eb

This change doesn't affect shader-db.

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-04-04 18:37:42 +01:00
Marek Olšák
6ca46c3d77 radeonsi: access gallivm through ctx in most places
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-04 16:55:21 +02:00
Marek Olšák
04e4fe594b radeonsi: use ctx->types instead of bld->types etc.
even vec_type is f32.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-04 16:55:19 +02:00
Marek Olšák
7a5e6dcba5 radeonsi: use i32_0/1 instead of *int_bld.zero/one in most places
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-04 16:55:16 +02:00
Marek Olšák
7216e1d8af gallium: decrease the size of pipe_draw_info - 88 -> 80 bytes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-04 11:14:43 +02:00
Marek Olšák
295f4f56cb gallium: decrease the size of pipe_vertex_element - 16 -> 8 bytes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-04 11:14:43 +02:00
Marek Olšák
e6428092f5 gallium: decrease the size of pipe_resource - 64 -> 48 bytes
Some other changes needed here.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-04 11:14:43 +02:00
Marek Olšák
3dfe61ed6e gallium: decrease the size of pipe_box - 24 -> 16 bytes
Also:

pipe_transfer: 48 -> 40 bytes.
pipe_blit_info: 176 -> 160 bytes.

v2: add a comment at pipe_box

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-04 11:14:43 +02:00
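The size reductions in 3dfe61ed6e above and the neighbouring gallium commits generally come from narrowing field types. The sketch below uses two hypothetical structs to show a 24 -> 16 byte reduction of that kind; which fields can be narrowed is an assumption of this example, and the sizes assume a typical ABI where `int32_t` is 4-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "before" layout: six 32-bit fields, 24 bytes. */
struct box_wide {
    int32_t x, y, z;
    int32_t width, height, depth;
};

/* Hypothetical "after" layout: four fields narrowed to 16 bits, 16 bytes.
 * Narrowing is only safe if every value stored there fits in int16_t. */
struct box_narrow {
    int32_t x;
    int16_t y, z;
    int32_t width;
    int16_t height, depth;
};

/* Compile-time checks so a later edit cannot silently grow the structs. */
static_assert(sizeof(struct box_wide) == 24, "unexpected box_wide size");
static_assert(sizeof(struct box_narrow) == 16, "unexpected box_narrow size");
```

Code that passes pointers to the narrowed fields (for example to a function expecting `int *`) has to be adjusted, which is the kind of breakage the st/clover fix 4d0399f175 addresses.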
Marek Olšák
9869a3b3ba gallium: decrease the size of pipe_sampler_view - 48 -> 32 bytes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-04 11:14:43 +02:00
Marek Olšák
4648bc2a8f gallium: decrease the size of pipe_surface - 48 -> 40 bytes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-04 11:14:43 +02:00
Marek Olšák
eb0fd0e5f8 gallium: decrease the size of pipe_framebuffer_state - 96 -> 80 bytes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-04 11:14:43 +02:00
Marek Olšák
19bc74f513 gallium: decrease the size of pipe_stream_output_info - 532 -> 268 bytes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-04 11:14:43 +02:00
Marek Olšák
15ff2f7aa9 gallium: decrease the size of pipe_rasterizer_state - 36 -> 32 bytes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-04 11:14:43 +02:00
Marek Olšák
18e760346a amd/addrlib: second update for Vega10 + bug fixes
Highlights:
- Display needs tiled pitch alignment to be at least 32 pixels
- Implement Addr2ComputeDccAddrFromCoord().
- Macro-pixel packed formats don't support Z swizzle modes
- Pad pitch and base alignment of PRT + TEX1D to 64KB.
- Fix support for multimedia formats
- Fix a case where "PRT" entries are not selected on SI.
- Fix wrong upper bits in equations for 3D resource.
- We can't support 2d array slice rotation in gfx8 swizzle pattern
- Set base alignment for PRT + non-xor swizzle mode resource to 64KB.
- Bug workaround for Z16 4x/8x and Z32 2x/4x/8x MSAA depth texture
- Add stereo support
- Optimize swizzle mode selection
- Report pitch and height in pixels for each mip
- Adjust bpp/expandX for format ADDR_FMT_GB_GR/ADDR_FMT_BG_RG
- Correct tcCompatible flag output for mipmap surface
- Other fixes and cleanups

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-04 11:14:43 +02:00
Marek Olšák
3e7d62a774 radeonsi: use i32_0 and i32_1 more
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-04 11:14:43 +02:00
Marek Olšák
29adaa19ac radeonsi: remove most uses of lp_build_const*
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-04 11:14:43 +02:00
Marek Olšák
7cec96a038 radeonsi: clean up 'radeon_bld' references
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-04 11:14:43 +02:00
Marek Olšák
0fb5a505fa radeonsi: fix broken texture filtering on SI-CIK since GFX9 changes
Don't clear state[7] on SI-CIK, and only do the meta stuff on VI+.
Fixes: 5abf60076c ("radeonsi/gfx9: image descriptor changes in mutable fields")

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100531
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-04 11:14:43 +02:00
Juan A. Suarez Romero
1bcdf74cdd bin/get-fixes-pick-list.sh: fix typo
Replace "nore" by "more".

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-04-04 09:05:44 +02:00
Mauro Rossi
72175bd2a5 android: intel: genxml: fix genX_xml.h generation rules
Recent changes in Makefile.sources merged the aubinator files into
a single list of generated files, and genxml/genX_xml.h is now needed
to avoid the following build error:

ninja: error: '.../genxml/genX_xml.h', needed by '.../genxml/genX_xml.h',
missing and no known rule to make it
build/core/ninja.mk:148: recipe for target 'ninja_wrapper' failed

Fixes: 0f83c05 "intel: genxml: compress all gen files into one"
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-04-04 09:10:46 +03:00
Jason Ekstrand
405ef7bb33 intel/vec4: Add some fall through comments
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-03 16:58:35 -07:00
Bartosz Tomczyk
64b3aa7ad8 mesa/glthread: Avoid unnecessary batch reallocation
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-04 09:56:52 +10:00
Bas Nieuwenhuizen
6e5e8a2e49 radv: Increase descriptor limits.
We generally support more. Decreased the dynamic buffers though, as
we only support 16 for uniform+storage.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-04-04 01:47:47 +02:00
Bartosz Tomczyk
95720851e2 mesa/glthread: fix misaligned address access
Address sanitizer reports a lot of misaligned accesses:
SUMMARY: AddressSanitizer: undefined-behavior main/marshal.c:276:31 in
main/marshal.c:276:31: runtime error: load of misaligned address 0x631000104866 for type
'const GLuint' (aka 'const unsigned int'), which requires 4 byte alignment
0x631000104866: note: pointer points here
 92 88 00 00 00 00  00 00 4a 03 0c 00 93 88  00 00 00 00 00 00 02 01  0c 00 40 8d 00 00 00 00  00 00
             ^
SUMMARY: AddressSanitizer: undefined-behavior main/marshal_generated.c:28725:12 in
main/marshal_generated.c:28726:12: runtime error: member access within misaligned address 0x6310003fc874 for type
'struct marshal_cmd_VertexAttribPointer', which requires 8 byte alignment
0x6310003fc874: note: pointer points here
  01 00 00 00 7a 02 20 00  00 00 00 00 be be be be  be be be be be be be be  be be be be be be be be
              ^
SUMMARY: AddressSanitizer: undefined-behavior main/marshal_generated.c:28726:12 in
main/marshal_generated.c:28726:12: runtime error: store to misaligned address 0x6310003fc87c for type
'GLint' (aka 'int'), which requires 8 byte alignment
0x6310003fc87c: note: pointer points here
  00 00 00 00 be be be be  be be be be be be be be  be be be be be be be be  be be be be be be be be

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-04 09:39:03 +10:00
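Reports like the ones quoted in 95720851e2 above are commonly addressed in one of two ways: read through `memcpy`, which has no alignment requirement, or pad every command so the next one starts aligned. The sketch below shows both with hypothetical helper names; it does not claim to be the change the patch makes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from a possibly unaligned address. */
uint32_t load_u32_unaligned(const void *ptr)
{
    uint32_t value;
    assert(ptr != NULL);
    memcpy(&value, ptr, sizeof(value));
    return value;
}

/* Pad a command size to a multiple of 8 bytes so that the command placed
 * after it in the batch starts on an 8-byte boundary. */
size_t pad_cmd_size(size_t size)
{
    return (size + 7) & ~(size_t)7;
}
```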
Bartosz Tomczyk
bcb63ee63e glsl: Fix blob memory leak
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-04-04 09:22:29 +10:00
Bas Nieuwenhuizen
a4c4efad89 radv: Rework guard band calculation.
We want the guardband_x/y to be the largest scalars such that each
viewport scaled by that amount is still a subrange of [-32767, 32767].

The old code has a couple of issues:
1) It used scissor instead of viewport_scissor, potentially taking into
   account a viewport that is too small and therefore selecting a scale
   that is too large.
2) Merging the viewports isn't ideal, as for example viewports with
   boundaries [0,1] and [1000, 1001] would allow a guardband scale of ~30k,
   while their union [0, 1001] only allows a scale of ~32.

The new code just determines the guardband per viewport and takes the minimum.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-04-03 23:03:46 +02:00
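The difference between the two approaches in a4c4efad89 above can be reproduced with a one-dimensional sketch. The struct and function names are hypothetical, and the scale here is taken about the viewport centre using its half extent, so the absolute numbers may differ from the rough figures in the message by a constant factor; the gap between the two approaches is what matters.

```c
#include <assert.h>

#define GUARDBAND_MAX_RANGE 32767.0

/* Hypothetical 1D viewport described by its two edges. */
struct viewport_1d {
    double lo, hi;
};

/* Largest factor by which the viewport can be scaled about its own centre
 * while staying inside [-GUARDBAND_MAX_RANGE, GUARDBAND_MAX_RANGE]. */
double guardband_for_viewport(struct viewport_1d vp)
{
    double half_extent = (vp.hi - vp.lo) / 2.0;
    double center = (vp.hi + vp.lo) / 2.0;
    double abs_center = center < 0.0 ? -center : center;
    assert(half_extent > 0.0);
    return (GUARDBAND_MAX_RANGE - abs_center) / half_extent;
}

/* New approach: compute the value per viewport and take the minimum. */
double guardband_min(const struct viewport_1d *vps, unsigned count)
{
    assert(count > 0);
    double result = guardband_for_viewport(vps[0]);
    for (unsigned i = 1; i < count; i++) {
        double g = guardband_for_viewport(vps[i]);
        if (g < result)
            result = g;
    }
    return result;
}

/* Old approach: merge all viewports into their union first. */
double guardband_union(const struct viewport_1d *vps, unsigned count)
{
    assert(count > 0);
    struct viewport_1d merged = vps[0];
    for (unsigned i = 1; i < count; i++) {
        if (vps[i].lo < merged.lo)
            merged.lo = vps[i].lo;
        if (vps[i].hi > merged.hi)
            merged.hi = vps[i].hi;
    }
    return guardband_for_viewport(merged);
}
```

With the viewports [0, 1] and [1000, 1001] from the message, the per-viewport minimum is several orders of magnitude larger than the value obtained from the union [0, 1001].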
Bas Nieuwenhuizen
d64f689f61 radv: Enable VK_KHR_incremental_present.
Just enabling the driver-independent implementation that Jason did.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-04-03 23:00:07 +02:00
Jason Ekstrand
0817110969 anv: Implement VK_KHR_incremental_present
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-04-03 13:51:08 -07:00
Jason Ekstrand
be1ecd8c6e vulkan/wsi/wayland: Pass damage through to the compositor
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-04-03 13:51:08 -07:00
Jason Ekstrand
f82b6c6272 vulkan/wsi: Plumb present regions through the common code
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-04-03 13:51:08 -07:00
Jason Ekstrand
3598a2907c vulkan/wsi: Fix some line wrapping
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2017-04-03 13:51:08 -07:00
Dave Airlie
22b116171f radv: fix interp at sample code.
Interp at sample needs to use the center, since the sample
positions it retrieves are relative to the center.

This fixes a bunch of CTS tests with multisample_interpolation.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-04 05:55:21 +10:00
Dave Airlie
1171b304f3 radv: overhaul fragment shader sample positions.
The current code was broken, and I decided to redesign it instead.

This puts the sample positions for all samples into the queue
constant descriptor buffer after all the spill/ring descriptors.

It then uses a single offset register to indicate where the positions
for num_samples start within that data. This saves one user sgpr
and means we only generate the sample position data in the rare
single case where we need it currently.

This doesn't fix the failing CTS tests without the followup
fix.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-04 05:55:15 +10:00
Lionel Landwerlin
471c1bc7cc aubinator/gen_decoder/i965: decode instructions from dword 0
Some packets like 3DSTATE_VF_STATISTICS, 3DSTATE_DRAWING_RECTANGLE,
3DPRIMITIVE, PIPELINE_SELECT, etc... have configurable fields in
dword0, we probably want to print those.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-03 20:45:34 +01:00
Lionel Landwerlin
04f2e80257 intel: gen_decoder: store pointer to current decoded field in iterator
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-03 20:45:34 +01:00
Dave Airlie
1e9e747d00 radv/ac: fix texture derivative ordering
The ordering NIR gives us is correct for the hw, this fixes:
dEQP-VK.glsl.texture_functions.texturegrad.* (mainly triggered
on isampler/usampler 3D textures).

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-04 05:39:10 +10:00
Dave Airlie
303d22f319 radv/ac: round cube array coordinate before fixup.
This fixes:
dEQP-VK.glsl.texture_functions.texture.samplercubearray*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-04 05:39:07 +10:00
Dave Airlie
5821f676ee radv: move to using common buffer load format.
Get rid of usage of SI.vs.load.input.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-04 05:37:52 +10:00
Brian Paul
b98ec1e920 util: fix MSVC warning in u_align_u32()
To silence
C:\Users\Brian\projects\mesa\src\util/u_vector.h(41) : warning C4146: unary
minus operator applied to unsigned type, result still unsigned

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-04-03 13:09:05 -06:00
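Warning C4146 in the u_align_u32() commit above is what MSVC reports when unary minus is applied to an unsigned value. As a minimal sketch, assuming the alignment is a nonzero power of two, the same mask can be built without negation. The name align_up_u32 is hypothetical and the actual Mesa change may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not Mesa's u_align_u32(): round v up to a multiple
 * of a, where a must be a nonzero power of two. ~(a - 1) yields the same
 * mask as -a in modular arithmetic, but never applies unary minus to an
 * unsigned type, so MSVC has nothing to warn about. */
static inline uint32_t
align_up_u32(uint32_t v, uint32_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}
```

For example, align_up_u32(17, 8) rounds up to the next multiple of 8.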
Brian Paul
960f640c7a util: #include "c99_compat.h" to fix Windows build
Otherwise, we were getting the definition for 'inline' by chance from
some other preceding #include.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-04-03 13:09:05 -06:00
Brian Paul
0fb2c16b3b util: s/SHA1_H/MESA_SHA1_H/
To follow the convention of other header include guards.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-04-03 13:09:05 -06:00
Brian Paul
7348df81b8 svga: add comment on svga_buffer_hw_storage_map()
Trivial.
2017-04-03 13:09:05 -06:00
Rhys Kidd
1572d11d89 travis: Support LLVM 3.8+ on Trusty-based Travis-CI via apt-get not apt addon
Per comments by Travis-CI, the apt addon is only really needed for the
container-based Precise builds, as they don't yet support Trusty on that platform.

Mesa currently uses Trusty fully-virtualized environment (due to sudo: required).

See further:
https://docs.travis-ci.com/user/trusty-ci-environment/#Fully-virtualized-via-sudo%3A-required
https://github.com/travis-ci/apt-source-whitelist/pull/205#issuecomment-216054237

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-04-03 19:43:12 +01:00
Grazvydas Ignotas
a6a38a038b util/u_atomic: provide 64bit atomics where they're missing
There are still some distributions trying to support unfortunate people
with old or exotic CPUs that don't have 64bit atomic operations. When
compiling for such a machine, gcc conveniently inserts a library call to
a helper, but its implementation is missing and we get a linker error.
This allows us to provide our own implementation, which is marked weak
to prefer a better implementation, should one exist.

v2: changed copyright, some style adjustments
v3: [mattst88] Print results with AC_MSG_CHECKING/AC_MSG_RESULT

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93089
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-04-03 10:52:41 -07:00
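The weak fallback idea in the u_atomic commit above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the patch: the function name fallback_add_and_fetch_8 is hypothetical, the global spinlock is just one possible way to serialize access, and the weak attribute is GCC/Clang specific.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* One global lock serializes every fallback operation. atomic_flag is the
 * only C11 atomic type that is guaranteed to be lock-free, so it is usable
 * even on CPUs without 64-bit atomic instructions. */
static atomic_flag fallback_lock = ATOMIC_FLAG_INIT;

/* Hypothetical name. Marking the definition weak lets a better
 * implementation from another library replace it at link time. */
__attribute__((weak)) uint64_t
fallback_add_and_fetch_8(uint64_t *ptr, uint64_t val)
{
   uint64_t result;

   assert(ptr != NULL);

   while (atomic_flag_test_and_set_explicit(&fallback_lock,
                                            memory_order_acquire))
      ; /* spin until the lock is released */

   *ptr += val;
   result = *ptr;

   atomic_flag_clear_explicit(&fallback_lock, memory_order_release);
   return result;
}
```

A single lock is slow under contention, which is acceptable for a path that only exists to make the link succeed on old or exotic CPUs.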
Rob Clark
70c272004f gallium/util: libunwind support
It's kinda sad that (a) we don't have debug_backtrace support on !X86
and that (b) we re-invent our own crude backtrace support in the first
place.  If available, use libunwind instead.  The backtrace format is
based on what xserver and weston use, since it is nice not to have to
figure out a different format.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-03 11:32:17 -04:00
Rob Clark
c3c884c49c gallium/util: clean up stack frame printing
Prep work for next patch.

Ideally 'struct debug_stack_frame' would be opaque, but it is embedded
in a bunch of places.  But at least we can treat it opaquely.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-03 11:32:17 -04:00
Samuel Pitoiset
0c0b29591c st/mesa: add st_convert_image()
Should be used by the state tracker when glGetImageHandleARB()
is called in order to create a pipe_image_view template.

v3: - move the comment to *.c
v2: - make 'st' const
    - describe the function

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-03 11:29:31 +02:00
Samuel Pitoiset
90534e9dba st/mesa: make 'st' const in st_mesa_format_to_pipe_format()
This avoids a compilation warning since st_convert_image()
requires 'st' to be const.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-03 11:29:18 +02:00
Bartosz Tomczyk
8d919ba384 mesa/glthread: Call unmarshal_batch directly in glthread_finish
Call it directly when batch queue is empty. This avoids costly thread
synchronisation. This commit improves performance of games that have
previously regressed with mesa_glthread=true.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-03 10:33:31 +10:00
Timothy Arceri
dbdd7231c2 mesa: disable glthread when DEBUG_OUTPUT_SYNCHRONOUS is enabled
We could re-enable it also but I haven't tested that yet, and I'm
not sure we care much anyway.

V2: don't disable it from within the call itself. We need a custom
    marshalling function or we get stuck waiting for thread to
    finish.

V3: tidy up redundant code copied from generated version.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-03 09:31:11 +10:00
Grazvydas Ignotas
a0f0f3958e amd/addrlib: fix optimized build warnings
All the -Wunused-but-set-variable ones.
Found a way to do it with a oneliner.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-03 00:48:26 +02:00
Grazvydas Ignotas
8e42038d87 radeonsi: use unreachable to fix a warning
si_state.c: In function ‘si_make_texture_descriptor’:
si_state.c:3240:25: warning: ‘num_format’ may be used uninitialized
si_state.c:3240:12: warning: ‘data_format’ may be used uninitialized

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-03 00:46:35 +02:00
Constantine Kharlamov
dc6b3c031e r600g: Add more (un)likely functions
The 1st is obvious because of the assert, the 2nd is stolen from si_draw_vbo(),
and the 3rd is just a small refactoring.

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-03 00:36:25 +02:00
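For readers unfamiliar with the (un)likely hints mentioned in the r600g commit above, here is a minimal sketch. The macro definitions are an assumption based on the GCC/Clang __builtin_expect builtin, and sum_indices is a hypothetical function, not code from r600g.

```c
#include <stddef.h>

/* Assumed definitions: on GCC/Clang, tell the compiler which way a branch
 * usually goes; elsewhere, fall back to the plain condition. */
#if defined(__GNUC__)
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define likely(x)   (!!(x))
#define unlikely(x) (!!(x))
#endif

/* Hypothetical example: the error check is marked unlikely so the common
 * path stays on the fall-through side of the branch. */
static int
sum_indices(const int *indices, size_t count)
{
   int sum = 0;

   if (unlikely(indices == NULL || count == 0))
      return -1;

   for (size_t i = 0; i < count; i++)
      sum += indices[i];

   return sum;
}
```

The hints only affect code layout and branch prediction heuristics; the result of the condition is unchanged.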
Constantine Kharlamov
807de52054 r600g: Remove intermediate assignment of pipe_draw_info
This removes the need to copy the whole struct on every call for no reason.
Comparing objdump -d output for the original and this patch, both compiled
with -O2, shows the function shrinking by 16 bytes.

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-03 00:36:25 +02:00
Constantine Kharlamov
4408e1ca53 r600g: Use separate index_bias variable
Needed to get rid of a separate struct allocation in the next patch, because
the one passed as an argument is const, which doesn't allow changing its fields.

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-04-03 00:36:25 +02:00
Ilia Mirkin
cb518f2fb2 nv30: fp/rast may be null when validating fb/scissor due to clear
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-02 11:03:00 -04:00
Ilia Mirkin
1184fba86e nvc0: fragprog may not be set when e.g. clearing
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-02 10:58:32 -04:00
Ilia Mirkin
7a0c1eee0c nv50: don't assume a rast is set when validating for clears
Clears can happen before a rast is set, which can in turn cause scissors
and fragprog to be validated. Make sure that we handle this case.

Reported-by: Andrew Randrianasulu <randrianasulu@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-04-02 10:58:32 -04:00
Dave Airlie
03a67fbbf7 radv: fix order of the guardband register emission.
y is vert, x is horiz.

Noticed in visual inspection compared to radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-02 20:17:30 +10:00
Edward O'Callaghan
f9387a223d mesa/main: Fix memset in formatquery.c
v2: We explicitly set each member to -1 over using a confusing
memset().

Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-04-02 15:18:38 +10:00
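The formatquery.c commit above prefers explicit assignment over memset(). A minimal sketch of that approach follows; fill_minus_one is a hypothetical name and the buffer type is assumed, since the commit message does not show the real code. memset() writes bytes, so using it to produce -1 in integers relies on every byte becoming 0xFF and on passing a size in bytes rather than elements, which is easy to get wrong.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: set each element to -1 explicitly, so the intent
 * does not depend on byte patterns or on a byte count. */
static void
fill_minus_one(int32_t *buffer, size_t count)
{
   assert(buffer != NULL || count == 0);

   for (size_t i = 0; i < count; i++)
      buffer[i] = -1;
}
```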
Samuel Pitoiset
515165ff0e radeonsi: add load_image_desc()
Similar to load_sampler_desc(). Same deal for bindless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-01 18:07:49 +02:00
Samuel Pitoiset
2f44402386 radeonsi: rework the load_sampler_desc() helpers
Will be more convenient for bindless because the 64bit handle is
actually the base_ptr of the descriptor (ie. 'list' will be fetched
from TGSI_FILE_CONSTANT/TGSI_FILE_TEMPORARY instead).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-04-01 18:07:49 +02:00
Samuel Pitoiset
8a3ef8c65d gallivm: add lp_build_emit_fetch_src() helper
lp_build_emit_fetch() is useful when the source type can be
inferred from the instruction opcode.

However, for bindless samplers/images we can't do that easily
because tgsi_opcode_infer_src_type() returns TGSI_TYPE_FLOAT for
TEX instructions, while we need TGSI_TYPE_UNSIGNED64 if the
resource register is bindless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-01 18:07:49 +02:00
Andres Gomez
8b10bf273d docs: add news item and link release notes for 17.0.1
Signed-off-by: Andres Gomez <agomez@igalia.com>
2017-04-01 18:51:40 +03:00
Andres Gomez
f4d2f3aa30 docs: add sha256 checksums for 17.0.3
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 71d2f05a9e)
2017-04-01 18:50:08 +03:00
Andres Gomez
5fa3f63036 docs: add release notes for 17.0.3
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 7f34ecae7f)
2017-04-01 18:50:06 +03:00
Erik Faye-Lund
86a9ddfef7 glsl: ir_explog_to_explog2 is no more
Since 63684a9a ("glsl: Combine many instruction lowering passes
into one.", Thu Nov 18 2010), we no longer have anything called
ir_explog_to_explog2. So it's only confusing to have those
references there.

Update with the appropriate method, so people can grep for it in
the current tree if they encounter it.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-01 13:39:52 +02:00
Erik Faye-Lund
99d8b933fd gallium/docs: remove documentation of removed arg
geom was removed in e968975 ("gallium: remove the geom_flags param
from is_format_supported", Tue Mar 8 00:01:58 2011 +0100), but the
documentation of it was left over. Let's bring the documentation up
to date.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-01 13:39:52 +02:00
Erik Faye-Lund
c33807463e st/mesa: avoid aliasing violation in st_cb_perfmon.c
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-01 13:39:52 +02:00
Michal Srb
52f9ccefcb st: Add cubeMapFace parameter to st_finalize_texture.
st_finalize_texture always accesses image at face 0, but it may not be
set if we are working with cubemap that had other face set.

This fixes crash in piglit
same-attachment-glFramebufferTexture2D-GL_DEPTH_STENCIL_ATTACHMENT.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-04-01 09:03:23 +02:00
Jason Ekstrand
d6fccb4c09 vulkan: Bump the header and XML to the latest public version
2017-03-31 22:41:43 -07:00
Karol Herbst
baaae8cb81 nv50/ir: also do PostRaLoadPropagation for FMA
Helps Feral-ported games, due to their use of fma()

shader-db changes:
total instructions in shared programs : 3934925 -> 3934327 (-0.02%)
total gprs used in shared programs    : 481563 -> 481563 (0.00%)
total local used in shared programs   : 27469 -> 27469 (0.00%)
total bytes used in shared programs   : 36061888 -> 36056504 (-0.01%)

                local        gpr       inst      bytes
    helped           0           0         228         228
      hurt           0           0           0           0

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-31 23:57:16 -04:00
Karol Herbst
7d007824a3 gm107/ir: add LIMM form of mad
v2: renamed commit
    reordered modifiers
    add assert(dst == src2)
v3: reordered modifiers again
v5: no rounding bit for limms

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-31 23:57:15 -04:00
Karol Herbst
ad638514e3 gk110/ir: add LIMM form of mad
v2: renamed commit
    reordered modifiers
    add assert(dst == src2)
v3: removed wrong neg mod emission

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-31 23:57:14 -04:00
Karol Herbst
d346b8588c nv50/ir: implement mad post ra folding for nvc0+
changes for GpuTest /test=pixmark_piano /benchmark /no_scorebox /msaa=0
/benchmark_duration_ms=60000 /width=1024 /height=640:

score: 1026 -> 1045

changes for shader-db:
total instructions in shared programs : 3943335 -> 3934925 (-0.21%)
total gprs used in shared programs    : 481563 -> 481563 (0.00%)
total local used in shared programs   : 27469 -> 27469 (0.00%)
total bytes used in shared programs   : 36139384 -> 36061888 (-0.21%)

                local        gpr       inst      bytes
    helped           0           0        3587        3587
      hurt           0           0           0           0

v2: removed TODO
    reordered to show changes without RA modification
    removed stale debugging print() call
v3: remove predicate checks
    enable only for gf100 ISA

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-31 23:57:13 -04:00
Karol Herbst
d6ce325147 nv50/ir: restructure and rename postraconstantfolding pass
we might want to add more folding passes here, so make it a bit more generic

v2: leave the comment and reword commit message
v4: rename it to PostRaLoadPropagation

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-31 23:57:12 -04:00
Karol Herbst
f2a4d881fe nvc0/ir: also do ConstantFolding for FMA
Helps mainly Feral-ported games, due to their use of fma()

shader-db changes:
total instructions in shared programs : 3941587 -> 3940749 (-0.02%)
total gprs used in shared programs    : 481511 -> 481460 (-0.01%)
total local used in shared programs   : 27469 -> 27481 (0.04%)
total bytes used in shared programs   : 36123344 -> 36115776 (-0.02%)

                local        gpr       inst      bytes
    helped           2          48         243         243
      hurt           2           3          32          32

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-31 23:57:10 -04:00
Karol Herbst
fac921db63 nvc0/ir: disable support for LIMMs on MAD/FMA
I hit an assert in the emitter while toying around with optimizations, because
ConstantFolding immediated a big int into a mad.

There is special handling for FMA/MAD in insnCanLoad, which is broken. With
this patch the special path should not be hit anymore. Anyway, the constraints
for the LIMMs can't be guaranteed in SSA form and I have patches pending to
use it via a post-SSA optimization pass.

As a result, immediates get immediated for int mad/fmas as well.

changes in shader-db:
total instructions in shared programs : 3943335 -> 3941587 (-0.04%)
total gprs used in shared programs    : 481563 -> 481511 (-0.01%)
total local used in shared programs   : 27469 -> 27469 (0.00%)
total bytes used in shared programs   : 36139384 -> 36123344 (-0.04%)

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
[imirkin: remove extra bit from insnCanLoad as well]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-31 23:57:08 -04:00
Lyude
31970ab9a6 nvc0: Add support for NV_fill_rectangle for the GM200+
This enables support for the GL_NV_fill_rectangle extension on the
GM200+ for Desktop OpenGL.

Signed-off-by: Lyude <lyude@redhat.com>

Changes since v1:
- Fix commit message
- Add note to reldocs
Changes since v2:
- Remove unnecessary parens in nvc0_screen_get_param()
- Fix sorting in release notes
- Don't execute FILL_RECTANGLE method on pre-GM200+ GPUs

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-31 21:41:36 -04:00
Lyude
82e0c5f484 st/mesa: Add support for NV_fill_rectangle
Signed-off-by: Lyude <lyude@redhat.com>

Changes since v1:
- Fix commit name

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-31 21:41:32 -04:00
Lyude
1cc7352c4c gallium: Add NV_fill_rectangle to pipe state
Signed-off-by: Lyude <lyude@redhat.com>

Changes since v1:
- Fix accidental widening of bitfields

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-31 21:41:29 -04:00
Lyude
ffe2bd676f gallium: Add a cap to check if the driver supports fill_rectangle
Changes since v1:
- Add pipe caps for etnaviv, freedreno, swr and virgl

Signed-off-by: Lyude <lyude@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-31 21:41:24 -04:00
Lyude
54af467334 mesa: Add support for GL_NV_fill_rectangle
Since we don't have the bits required to support this in OpenGLES yet,
this only enables support for Desktop OpenGL

Signed-off-by: Lyude <lyude@redhat.com>

Changes since v1:
- Simplify _mesa_PolygonMode() a little bit
- Fix formatting in OpenGL spec excerpts
- Move polygon mode checking into _mesa_valid_to_render()
Changes since v3:
- Improve error message for invalid drawings with GL_FILL_RECTANGLE_NV

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-31 21:41:20 -04:00
Lyude
a7cb2b40ed glapi: Add GL_NV_fill_rectangle
Signed-off-by: Lyude <lyude@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-31 21:41:08 -04:00
Marek Olšák
150736b5c3 gallium: remove support for predicates from TGSI (v2)
Never used.

v2: gallivm: rename "pred" -> "exec_mask"
    etnaviv: remove the cap
    gallium: fix tgsi_instruction::Padding

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-04-01 00:06:41 +02:00
Dave Airlie
c011fe7452 radv: enable tessellation shaders.
This enables tessellation shaders and sets some values for
the maximums.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:17:25 +10:00
Dave Airlie
cb1518e96b radv/ac: setup lds for tessellation
This seems to have gotten lost in the rebases; it should fix
the tessellation demos' crash in LLVM.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:17:15 +10:00
Dave Airlie
3f0d69af20 radv: add ia_multi_vgt_param tessellation support.
This just ports the relevant radeonsi pieces.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:17:08 +10:00
Dave Airlie
b4495b71c6 radv/cmd: emit tessellation state.
This emits the tessellation shaders and state to the command stream.

It contains the logic to emit the LS/HS shaders.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:16:57 +10:00
Dave Airlie
60fc0544e0 radv/pipeline: handle tessellation shader compilation
So tess shaders have some circular dependencies,

TCS needs the TES primitive mode
TES needs the TCS vertices out

This builds the nir for each shader first to get the
info, executes a tes specific nir pass, then builds
the LLVM shaders.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:16:51 +10:00
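The circular dependency described in the tessellation pipeline commit above (TCS needs the TES primitive mode, TES needs the TCS vertices out) can be broken by gathering info from both shaders before compiling either. The sketch below only illustrates that ordering; every struct and function name in it is hypothetical and much simpler than the real radv pipeline code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-shader info, as it might be gathered from NIR in a
 * first pass over both stages. */
struct tcs_info { unsigned vertices_out; };
struct tes_info { unsigned primitive_mode; };

/* Hypothetical compile keys: each stage gets its own value plus the one
 * value it needs from the other stage. */
struct tcs_key { unsigned vertices_out; unsigned tes_primitive_mode; };
struct tes_key { unsigned primitive_mode; unsigned tcs_vertices_out; };

/* Second pass: with both infos known, fill both keys. Only after this
 * step would the backend compile of either shader start. */
static void
resolve_tess_keys(const struct tcs_info *tcs, const struct tes_info *tes,
                  struct tcs_key *tcs_key, struct tes_key *tes_key)
{
   assert(tcs != NULL && tes != NULL);
   assert(tcs_key != NULL && tes_key != NULL);

   tcs_key->vertices_out = tcs->vertices_out;
   tcs_key->tes_primitive_mode = tes->primitive_mode;

   tes_key->primitive_mode = tes->primitive_mode;
   tes_key->tcs_vertices_out = tcs->vertices_out;
}
```

The point of the two-pass shape is that neither compile step ever has to wait on the other stage's output, only on its cheaply gathered info.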
Dave Airlie
aaabdd6bc6 radv/ac: handle writing out tess factors.
This ports the code from radeonsi to build the if/endif,
and ports the tess factor emission code. This code has
an optimisation TODO that we can deal with later.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:16:47 +10:00
Dave Airlie
94f9591995 radv/ac: add support for TCS/TES inputs/outputs.
This adds support for the tessellation inputs/outputs to the
shader compiler, this is one of the main pieces of the patch.

It is very similar to the radeonsi code (post merge we should
consider if there are better sharing opportunities). The main
difference from radeonsi is that we can have "compact" varyings
for clip/cull/tess factors, and we have to add special handling
for these.

This consists of treating the const index from the deref differently
depending on the compactness.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:16:42 +10:00
Dave Airlie
5ab1289b48 radv/ac: add clip support for tess eval shader.
As this may be the last shader to emit clip distances.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:16:37 +10:00
Dave Airlie
326b9bc6dc radv/ac: hook up tessellation intrinsics.
This just adds support for the nir intrinsics that tessellation uses.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:16:32 +10:00
Dave Airlie
d8ab71b207 radv/ac: hook up shader information handling for tessellation
This hooks up the tessellation shader info to the nir values
and ctx generated ones.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:16:27 +10:00
Dave Airlie
4c60c68bd1 radv/pipeline: start calculating tess stage.
This calculates the pipeline state for tessellation.

It moves the gs ring calculation down to below
where the tessellation shaders will be compiled,
as it needs the info from those shaders.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:16:19 +10:00
Dave Airlie
823b55a8a9 radv: add tessellation support to variant code.
This just fills out the rsrc registers for tess shaders.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:16:14 +10:00
Dave Airlie
f239f59778 radv: add tessellation support to shader naming
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:16:08 +10:00
Dave Airlie
5b40eab00a radv: add tess ctrl stage barrier workaround for SI.
This just ports the workaround from radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:16:04 +10:00
Dave Airlie
3a633cc2cb radv/ac: add support for patch inputs to unique index code.
This adds support for tessellation patch inputs to the code
that finds the unique parameter index.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:15:57 +10:00
Dave Airlie
aeb49bc2b9 radv: port polaris vgt vertex reuse workaround.
This ports the VGT_VERTEX_REUSE register settings
for Polaris GPUs from radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:15:51 +10:00
Dave Airlie
46a820b383 radv: configure tessellation distribution register.
This just takes the radeonsi values.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:15:45 +10:00
Dave Airlie
60326a7afc radv/ac: setup tessellation shader inputs.
This just configures all the register inputs for the tessellation
related stages.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:15:41 +10:00
Dave Airlie
3968162751 radv/ac: setup tess rings on compiler side.
This just sets up the necessary pointers on the compiler
side for the rings needed for tessellation.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:15:35 +10:00
Dave Airlie
46e52df34d radv: add tessellation ring allocation support. (v2)
This patch adds support for the offchip rings for storing
tessellation factors and attribute data.

It includes the register setup for the TF ring

v2: always do tess ring size calcs (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:15:30 +10:00
Dave Airlie
bbfb62df16 radv: add support for some device specific tess information.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:15:26 +10:00
Dave Airlie
2b3c4bcc1f radv/ac: add tess changes to shader keys/info
This adds the tess pieces for shader keys and shader info,
it adds the necessary bits to the vertex key/info as well.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:15:22 +10:00
Dave Airlie
a4b039db04 radv: add tess shader stage user data support.
This just adds support for tess to the shader stage conversion
and emits the per-stage descriptors/constants for tess stages.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:15:15 +10:00
Dave Airlie
a5136a97f7 radv: use defines for ring descriptor offsets.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:15:12 +10:00
Dave Airlie
0604284e3f radv: add helper function to denote if tess is enabled on a pipeline.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:14:59 +10:00
Dave Airlie
97e0ff30c0 radv: handle clip dist in es outputs.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:14:53 +10:00
Dave Airlie
6279646306 radv: drop unneeded start
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:14:39 +10:00
Dave Airlie
a58d03a5a2 radv: fixup geometry clip emission since using the geom pass
Fixes: 2b35b60d: radv: move to using nir clip/cull merge pass.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:14:38 +10:00
Marek Olšák
744317c9d2 radeonsi/gfx9: allow CMASK fast clear with RB+
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-31 21:41:57 +02:00
Marek Olšák
ea59521475 radeonsi/gfx9: don't compare src_va w/ dst_va for CP_DMA_CLEAR
src_va contains the clear value in this case.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-31 21:41:57 +02:00
Marek Olšák
e3cb67dc6b radeonsi/gfx9: fix 1D array fetches with derivs, bias, or Z compare value
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-31 21:41:57 +02:00
Marek Olšák
6ab2042761 radeonsi/gfx9: fix and enable single-sample CMASK fast clear
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-31 21:41:57 +02:00
Marek Olšák
d4bb4583b0 radeonsi/gfx9: fix and enable MSAA compression
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-31 21:41:57 +02:00
Marek Olšák
06d725ab2f radeonsi/gfx9: disable CE
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-31 21:41:57 +02:00
Marek Olšák
35aaccaf81 radeonsi/gfx9: fix linear mipmap CPU access
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-31 21:41:57 +02:00
Marek Olšák
322eb13f09 radeonsi: add tests verifying that VM faults don't hang
GFX9 hangs instead of writing VM faults to dmesg.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-31 21:41:57 +02:00
Marek Olšák
283c31afa1 radeonsi: unify HS max_offchip_buffers workarounds
Vulkan doesn't set more than 508.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-31 21:41:57 +02:00
Marek Olšák
829bd77235 radeonsi: adjust checking for SC bug workarounds
no change in behavior, just making sure that no later chips will use
the workarounds

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-31 21:41:56 +02:00
Brian Paul
2936f5c37e glsl: use -O1 optimization for builtin_functions.cpp with MinGW
Some versions of MinGW-w64 such as 5.3.1 and 6.2.0 produce bad code
with -O2 or -O3 causing a random driver crash when running programs
that use GLSL.  Most Mesa demos in the glsl/ directory trigger the
bug, but not the fragcoord.c test.

Use a #pragma to force -O1 for this file for later MinGW versions.
Luckily, this is basically one-time setup code.  I suspect the bug
is related to the sheer size of this file.

This should let us move to newer versions of MinGW-w64 for Mesa.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-31 13:36:25 -06:00
Brian Paul
15bb0511d6 tnl: remove unused var to silence warning
Trivial.
2017-03-31 13:30:54 -06:00
Neha Bhende
2e24a11f1d st/wgl: Replace variable name hdc with hDrawDC
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-03-31 13:30:54 -06:00
Brian Paul
7d0aac2392 st/wgl: add support for WGL_ARB_make_current_read
This adds the wglMakeContextCurrentARB() and wglGetCurrentReadDCARB()
functions.

Signed-off-by: Brian Paul <brianp@vmware.com>
2017-03-31 13:30:54 -06:00
Brian Paul
7753f040fa stw/wgl: add null context check in wglBindTexImageARB()
To avoid dereferencing a null pointer in case wglMakeCurrent() wasn't
called.  Found while debugging SWKOTOR game.

Reviewed-by: Neha Bhende <bhenden@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-03-31 13:30:53 -06:00
Marek Olšák
7d2fa8dc10 radeonsi: decompress DCC in set_sampler_view instead of create_sampler_view (v2)
v2: don't add a new decompress helper function

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-31 20:57:53 +02:00
Marek Olšák
8c7d1ded19 radeonsi: decompress DCC in set_framebuffer_state instead of create_surface (v2)
for threaded gallium, which can't use pipe_context in create_surface

v2: don't add a new decompress helper function

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-31 20:57:53 +02:00
Nicolai Hähnle
d10fbe5159 st/glsl_to_tgsi: fix 64-bit integer bit shifts
Fix a bug that was caused by a type mismatch in the shift count between
GLSL and TGSI. I briefly considered adjusting the TGSI semantics, but
since both LLVM and AMD GCN require both arguments to be of the same type,
it makes more sense to keep TGSI as-is -- it reflects the underlying
implementation better.

I'm also sending out piglit tests that expose this error.

v2: use the right number of components for the temporary register

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-31 18:15:50 +02:00
Nicolai Hähnle
c22841d8d2 tgsi: fix printing of 64-bit integer immediates
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-31 18:15:47 +02:00
Lionel Landwerlin
74a80d579d intel: genxml: fix out of tree builds
v2: use Emil's recommendation
    change rule to closer to genxml/genX_bits.h

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-31 15:29:57 +01:00
Thomas Hellstrom
18e2aa063c gbm/dri: Check dri extension version before flush after unmap
The commit mentioned below required the __DRI2FlushExtension to have
version 4 or above, for GBM functionality. That broke GBM with some
classic dri drivers. Relax that requirement so that we only flush
after unmap if we have version 4 or above. Drivers that require the flush
for correct functionality should implement the desired version.

Fixes: ba8df228 ("gbm/dri: Flush after unmap")
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Dylan Baker <dylan@pnwbakers.com>
2017-03-31 10:25:46 +02:00
Nicolai Hähnle
02112c3ef7 radeonsi: implement ARB_shader_group_vote
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-31 07:56:27 +02:00
Nicolai Hähnle
cd3f386069 radeonsi: enable ARB_shader_clock
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-31 07:56:27 +02:00
Nicolai Hähnle
2290535d62 radeonsi: emit TGSI_OPCODE_CLOCK
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-31 07:56:26 +02:00
Nicolai Hähnle
65b542a7cc st/mesa: implement ARB_shader_clock
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-31 07:56:26 +02:00
Ilia Mirkin
94ec847cb0 tgsi: add CLOCK opcode
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-31 07:56:26 +02:00
Nicolai Hähnle
d0c7f924a3 gallium: add PIPE_CAP_TGSI_CLOCK
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-31 07:56:25 +02:00
Nicolai Hähnle
44125b29d1 glsl: fix clockARB builtin function
The underlying intrinsic is defined to always have a uvec2 return type.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-31 07:56:25 +02:00
Tapani Pälli
3535b87a1a anv: change BLOCK_POOL_MEMFD_SIZE to 1GB
This allows us to run 32-bit Vulkan apps on Android; the ftruncate
call would fail on 2GB (max size being 2GB - 1).

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-31 08:43:28 +03:00
Tapani Pälli
2398770c87 android: add libmesa_genxml as dep to libmesa_isl
This is to fix the following compile error with libmesa_isl:
   mesa/src/intel/isl/isl.c:28:10: fatal error: 'genxml/genX_bits.h' file not found

Fixes: f0eaf38 ("genxml: New generated header genX_bits.h (v6)")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-31 08:42:54 +03:00
Timothy Arceri
3e524cfa47 mesa: remove MESA_GLSL=opt
This is unused.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-31 13:43:38 +11:00
Timothy Arceri
2caa3aa1f4 mesa: remove MESA_GLSL=no_opts env option
This is confusing because it only applies to GL_ARB_vertex/fragment_program,
and because of that it's also not very useful.

If someone requires this for debugging they can just make an ad-hoc
code change.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-31 13:43:38 +11:00
Timothy Arceri
94224950dd mesa: move FLUSH_VERTICES() call to meta
There is no need for this to be in the common code.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-31 13:43:38 +11:00
Timothy Arceri
2e70de7d2f mesa/vbo: remove redundant _mesa_is_bufferobj() calls
This is already called inside the vbo_exec_vtx_{unmap,map}()
functions.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-31 11:54:37 +11:00
Timothy Arceri
3ef1ff6270 mesa/glthread: add async support to ARB_gpu_shader_int64 uniform functions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-31 11:54:36 +11:00
Timothy Arceri
eb3df0e838 mesa/glthread: add async support to ARB_gpu_shader_fp64 uniform functions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-31 11:54:35 +11:00
Lionel Landwerlin
469da094e1 aubinator: enable snb/ilk through --gen
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-03-31 01:25:33 +01:00
Lionel Landwerlin
0f83c05149 intel: genxml: compress all gen files into one
Combining all the files into a single string didn't make any
difference in the size of the aubinator binary.

With this change we now also embed gen4/4.5/5 descriptions, which
increases the aubinator size by ~16Kb.

v2 (Lionel): rebase makefiles

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-03-31 01:24:56 +01:00
Bas Nieuwenhuizen
0f3de89a56 radv: Use the guard band.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-30 22:21:14 +02:00
Bas Nieuwenhuizen
8a53e6e4c5 radv: Prepare for not using the guard band for lines & points.
Vulkan Clipping is defined in terms of vertices, the scissor based
clipping happens on pixels. There is a difference with points and
lines, as a vertex can be outside the viewport while some pixels are in.
On Vulkan those pixels shouldn't be drawn, while they would be with
the guardband.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-30 22:21:14 +02:00
Bas Nieuwenhuizen
76603aa90b radv: Drop the default viewport when 0 viewports are given.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-30 22:21:14 +02:00
Bas Nieuwenhuizen
4083a2ddcb radv: Set proper viewport & scissor for meta draws.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-30 22:21:14 +02:00
Lyude
42f2bccd11 mesa: Fix trailing whitespace in polygon.c
Signed-off-by: Lyude <lyude@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-30 11:59:51 -07:00
Lyude
043ee96059 mesa: Fix gross indenting in _mesa_PolygonMode()
Signed-off-by: Lyude <lyude@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-30 11:59:51 -07:00
Lyude
a1ce8a3fe2 r300: Fix indenting in r300_get_param()
Signed-off-by: Lyude <lyude@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-30 11:59:51 -07:00
Lyude
e5c6c421c4 vc4: Fix indenting in vc4_screen_get_param()
Signed-off-by: Lyude <lyude@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-30 11:59:51 -07:00
Kenneth Graunke
e113dfabad intel: Add INTEL_CFLAGS to aubinator CFLAGS.
It still needs intel_aub.h.  Fixes the build.
2017-03-30 11:58:00 -07:00
Jason Ekstrand
fbcf92a278 nir: Add support for 8 and 16-bit types
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-03-30 11:34:45 -07:00
Jason Ekstrand
28e41506a6 nir/constant_expressions: Don't switch on bit size when not needed
For opcodes such as the nir_op_pack_64_2x32 for which all sources and
destinations have explicit sizes, the bit_size parameter to the evaluate
function is pointless and *should* do nothing.  Previously, we were
always switching on the bit_size and asserting if it isn't one of the
sizes in the list.  This generates way more code than needed and is a
bit cruel because it doesn't let us have a bit_size of zero on an ALU op
which shouldn't need a bit_size.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-03-30 11:34:45 -07:00
Jason Ekstrand
b69b44d222 nir/constant_expressions: Pull the guts out into a helper block
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-03-30 11:34:45 -07:00
Kenneth Graunke
f5e5c0c101 i965: Stop using legacy dri_bufmgr_* and intel_* names.
Eric renamed these from dri_bufmgr_* and intel_bufmgr_* to drm_intel_*
in libdrm commit 4b9826408f65976a1a13387beda748b65e03ec52, circa 2008,
but we've been using the legacy names this whole time.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-30 11:16:34 -07:00
Emil Velikov
3df993e1a2 intel: automake: move INTEL_CFLAGS as applicable
Only common/decoder.[ch] requires it [for intel_aub.h].

v2: The code was moved from intel/tools to intel/common,
update accordingly.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-30 19:07:28 +01:00
Emil Velikov
4ffb394961 intel: android: remove libdrm_intel requirement
The only part which requires libdrm_intel, tools/aubinator, is not built
on Android.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-30 19:07:23 +01:00
Marek Olšák
331714d72e Partially revert "amd/addrlib: silence warnings" to fix builds with DEBUG
This partially reverts commit 8a74140a21.
2017-03-30 19:17:39 +02:00
Marek Olšák
681adbc18c ddebug: implement clear_texture
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 18:53:42 +02:00
Marek Olšák
83d3e6fbff radeonsi: fix an unused-variable warning in a release build 2017-03-30 17:22:25 +02:00
Marek Olšák
bb2e05885d vdpau: fix a maybe-uninitialized warning 2017-03-30 17:14:47 +02:00
Marek Olšák
65732a8ff6 softpipe: fix a maybe-uninitialized warning
/home/marek/dev/mesa-main/src/gallium/drivers/softpipe/sp_compute.c:178:
 warning: 'grid_size' may be used uninitialized in this function
 [-Wmaybe-uninitialized]
2017-03-30 17:14:47 +02:00
Marek Olšák
9f5dbbe030 gallivm: fix a maybe-uninitialized warning
/home/marek/dev/mesa-main/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c:3598:
 warning: 'level' may be used uninitialized in this function [-Wmaybe-uninitialized]
       out1 = lp_build_cmp(&leveli_bld, PIPE_FUNC_GREATER, level, last_level);
            ^
2017-03-30 17:14:47 +02:00
Marek Olšák
3b1934d9b6 gallium/radeon: s/dcc_disable/disable_dcc/
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-03-30 16:09:39 +02:00
Marek Olšák
45a71d5de5 radeonsi: handle incompatible DCC formats in resource_copy_region
Required because of later commits.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
2017-03-30 16:09:39 +02:00
Marek Olšák
b05b8587ae radeonsi: remove a workaround for inexact *8_SNORM blits
All tests pass on Fiji now. This prevents DCC from being disabled because
of incompatible DCC formats caused by the fallback.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
2017-03-30 16:09:39 +02:00
Marek Olšák
a955ee788f gallium/radeon: add and use a new helper vi_dcc_enabled
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-03-30 16:09:37 +02:00
Marek Olšák
f7bd51626e gallium/radeon: formalize that r600_query_hw_add_result doesn't need a context
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-03-30 16:09:36 +02:00
Marek Olšák
d76c306162 radeonsi: don't make a copy of pipe_index_buffer in draw_vbo
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-03-30 16:09:32 +02:00
Marek Olšák
abb25fb18e gallium/util: use const in u_index_modify helpers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-03-30 16:09:29 +02:00
Samuel Pitoiset
7d99f48b5e winsys/amdgpu: remove AMDGPU_INFO_NUM_EVICTIONS
This is now exposed with libdrm_amdgpu 2.4.76.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 15:27:13 +02:00
Marek Olšák
675af982e1 radeonsi: add Vega10 PCI IDs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Boyuan Zhang
cb8b84e3d0 radeon/uvd: set correct vega10 db pitch alignment
Create a new function to get the correct alignment based on the ASIC, and
change the corresponding decode message buffer and dpb buffer size
calculations.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-03-30 14:44:33 +02:00
Leo Liu
5eba761fee radeon/vce: add vce support for firmware 53.19.4
v2: squashed with other similar commits

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-03-30 14:44:33 +02:00
Leo Liu
ed48b399f1 radeon/vce: adapt gfx9 surface to vce
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-03-30 14:44:33 +02:00
Leo Liu
6c7870fee8 winsys/surface: add height pitch for gfx9
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-30 14:44:33 +02:00
Leo Liu
c89e771c9c radeon/uvd: clear message buffer when reuse
As required by firmware

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-03-30 14:44:33 +02:00
Leo Liu
c836f2ce28 radeon/uvd: adapt gfx9 surface to uvd
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-03-30 14:44:33 +02:00
Leo Liu
9d5db4e8f4 radeon/uvd: add uvd soc15 register
Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
474468fbf9 radeonsi/gfx9: disable features that don't work
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
8ea3da0706 radeonsi/gfx9: only allow GL 3.1
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
7695ea0c02 radeonsi/gfx9: add linear address computations for texture transfers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
172b05a37e radeonsi/gfx9: don't generate LS and ES states
these shaders don't exist on GFX9

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
eb22f5bf6f radeonsi/gfx9: SPI_SHADER_USER_DATA changes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
f4ab7a5415 winsys/amdgpu: set/get BO tiling flags for GFX9
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
7d88233f84 radeonsi/gfx9: handle pitch and offset overrides for texture_from_handle
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
de55e57e29 radeonsi/gfx9: set/validate GFX9 BO metadata
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
bd1da6b339 radeonsi/gfx9: add radeon_surf.gfx9.surf_offset
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
3685a12bad radeonsi/gfx9: don't write mipmap level offsets to BO metadata
GFX9 doesn't have (usable) mipmap offsets.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
9c100bd693 radeonsi/gfx9: flush CB & DB caches with an EOP TS event
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
6e0d64712a radeonsi/gfx9: use ACQUIRE_MEM
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
81aa21d732 radeonsi/gfx9: only use CE RAM for most-used descriptors
because the CE RAM size decreased to 4 KB.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
86f13c7363 radeonsi/gfx9: emit FLUSH_DFSM where required
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
ad93d72c34 radeonsi/gfx9: emit BREAK_BATCH in emit_framebuffer_state
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
405bacd820 radeonsi/gfx9: fix MIP0_WIDTH & MIP0_HEIGHT for compressed texture blits
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
354285afa0 radeonsi/gfx9: fix textureSize/imageSize for 1D textures
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
566defad13 radeonsi/gfx9: add a workaround for 1D depth textures
The same workaround is used by Vulkan.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
fc3c503b5d radeonsi/gfx9: enable clamping for Z UNORM formats promoted to Z32F
so that shaders don't have to do it.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
5abf60076c radeonsi/gfx9: image descriptor changes in mutable fields
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
c8ffec4f4b radeonsi/gfx9: FMASK image descriptor changes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
d60f72a9f0 radeonsi/gfx9: image descriptor changes in immutable fields
The border color swizzle logic was copied from Vulkan. It doesn't make any
sense to me, but it passes all piglits except the stencil ones.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
dfd2b54948 radeonsi/gfx9: DB changes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
94819a3e6c radeonsi/gfx9: CB changes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
272b50a6f4 radeonsi/gfx9: do DCC clears on non-mipmapped textures only
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
aba8e0ea68 radeonsi/gfx9: update can_sample_z/s flags
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
054dcbe42c radeonsi/gfx9: pass correct parameters to buffer_get_handle
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
710aaed52b radeonsi/gfx9: update si_set_optimal_micro_tile_mode
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
7fcad40ca5 radeonsi/gfx9: don't check array_mode for allowing TC-compatible HTILE
GFX9 supports this with all modes except linear.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
6f09b0d076 radeonsi/gfx9: update HTILE/CMASK/FMASK allocators
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
281542c690 radeonsi/gfx9: stub testdma - array_mode_to_string
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
a0e8b73594 radeonsi/gfx9: update r600_print_texture_info
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
b25d7c2cbf gallium/radeon: move pre-GFX9 radeon_bo_metadata.* to u.legacy.*
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
9b365d497a winsys/amdgpu: set num_tile_pipes, pipe_interleave_bytes for GFX9
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
493de7f935 winsys/amdgpu: wire up new addrlib for GFX9
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
e572835fea winsys/amdgpu: update amdgpu_addr_create for GFX9
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
a71139470c winsys/amdgpu: rename GFX6 surface functions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
9ca33ab78e gallium/radeon: add GFX9 surface info to radeon_surf
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
ba2e7c68ce gallium/radeon: move pre-GFX9 radeon_surf.* members to radeon_surf.u.legacy.*
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
641b79774a radeonsi/gfx9: allow Z16_UNORM for TC-compatible HTILE
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
a4f0a1099f radeonsi/gfx9: draw changes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
b39fade67c radeonsi/gfx9: pad shader binaries by 128 bytes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
5271d12a6e radeonsi/gfx9: trivial shader and ring changes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
0aae4f4764 radeonsi/gfx9: sampler state changes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
71eca0780a radeonsi/gfx9: add a scissor bug workaround
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
b576df4017 radeonsi/gfx9: rasterizer changes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
be8eba0625 radeonsi/gfx9: disable the 2-bit format fetch fix
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
31b1042276 radeonsi/gfx9: set NUM_RECORDS correctly
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
5f4659260e radeonsi/gfx9: ELEMENT_SIZE change
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
d214b95e9a radeonsi/gfx9: enable ETC2
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
6d21fd51b6 radeonsi/gfx9: disable RB+ on Vega10
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
2862300d9e radeonsi/gfx9: init_config changes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
b054718218 radeonsi/gfx9: don't set PA_SC_RASTER_CONFIG*
The registers don't exist on GFX9.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
de7967a27a radeonsi/gfx9: Gather4 no longer needs the workaround
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
71ad666414 radeonsi/gfx9: CP DMA changes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
7690196135 radeonsi/gfx9: query changes - EVENT_WRITE and SET_PREDICATION
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
ea9cf0a322 radeonsi/gfx9: EVENT_WRITE_EOP -> RELEASE_MEM
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
3e3d4f5e1d radeonsi/gfx9: INDIRECT_BUFFER change
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
9680a75489 radeonsi/gfx9: enable SDMA buffer copying & clearing
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
c9b004af58 radeonsi/gfx9: handle GFX9 in a few places
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
92112ec296 radeonsi/gfx9: don't read back non-existent SRBM registers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
ef97cc0cae radeonsi/gfx9: add IB parser support
Both GFX6 and GFX9 fields are printed next to each other in parsed IBs.

The Python script parses both headers like one stream and tries to merge
all definitions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
9338ab0afd radeonsi/gfx9: set the LLVM processor, require LLVM 5.0
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
68d6d097f1 radeonsi/gfx9: add GFX9 and VEGA10 enums
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
5691e14735 amd: GFX9 packet changes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
ecbdfbeb05 amd: define event types for GFX9
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
00e777b61c amd: add texture format definitions for GFX9
the DATA_FORMAT and NUM_FORMAT fields are the same, but some of the enums
differ, thus add GFX6 and GFX9 suffixes, so that the IB parser can show
enums for both.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
e6c520362d amd: resolve remaining definition conflicts with gfx9d.h
Add _GFX6 and _GFX9 suffixes to conflicting definitions.

sid.h and gfx9d.h can now be included in the same file.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
7e7043c31c amd: normalize register definition formatting
This resolves trivial conflicts with gfx9d.h caused by different formatting.
Some fields are also renamed.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
db04d4ccaa amd: import GFX9 register definitions
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
a3556c0f06 radeonsi: code shuffling in si_init_depth_surface
use fewer local variables, re-order the assignments, so that the GFX9 diff
is smaller here.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
8a74140a21 amd/addrlib: silence warnings 2017-03-30 14:44:33 +02:00
Nicolai Hähnle
7f160efcde amd/addrlib: import gfx9 support 2017-03-30 14:44:33 +02:00
Kevin Furrow
047d6daf10 amd/addrlib: Not all ETC2 formats are 128bpp... add new ETC2 formats to differentiate between 64 and 128bpp formats. 2017-03-30 14:44:33 +02:00
Kevin Furrow
1360018c1c amd/addrlib: Fix selection of swizzle modes for 3D compressed images. 2017-03-30 14:44:33 +02:00
Kevin Furrow
9705e3b72c amd/addrlib: Add support for ETC2 and ASTC formats. 2017-03-30 14:44:33 +02:00
Joe Ma
a489cdb20f amd/addrlib: Bump version to 6.02 2017-03-30 14:44:33 +02:00
Frans Gu
e736edf63d amd/addrlib: Adjust slice size after pitch and actual height adjustment 2017-03-30 14:44:33 +02:00
Frans Gu
588e5bbf3d amd/addrlib: Apply input pitch after internal pitch aligning 2017-03-30 14:44:33 +02:00
Nicolai Hähnle
11f1306207 amdgpu/addrlib: Bump version to 6.01
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Nicolai Hähnle
a136926eef amdgpu/addrlib: Separate 2 dcc related workarounds by different flags
1) dccCompatible for padding MSAA surface to support fast clear
2) dccPipeWorkaround for padding surface to support dcc
2017-03-30 14:44:33 +02:00
Nicolai Hähnle
48bf5d0800 amdgpu/addrlib: Fix the issue that tcCompatible HTILE slice size is not calculated correctly 2017-03-30 14:44:33 +02:00
Nicolai Hähnle
33c25655c1 amdgpu/addrlib: Add a new output flag to notify client that the returned tile index is for PRT on SI
If this flag is set for mip0, client should set prt flag for sub mips,
so that address lib can select the correct tile index for sub mips.
2017-03-30 14:44:33 +02:00
Xavi Zhang
fa906a888b amdgpu/addrlib: add matchStencilTileCfg and tcCompatible fixes
Usage: the client first calls AddrComputeSurfaceInfo() on the depth
surface with the flag "matchStencilTileCfg". AddrLib will use the
2DThin1 tile index for depth as much as possible and will not downgrade
unless the alignment requirement cannot be met.

1. If there is a matching 2DThin1 tile index for stencil that ensures
depth and stencil share the same tile config parameters, then return
the stencil 2DThin1 tile index as well.
2. If 2DThin1 tile mode cannot ensure this and the TcCompatible flag
was set, then ignore that flag and try 2DThin1 tile mode for depth and
stencil again.
3. If 2DThin1 tile mode cannot ensure that depth and stencil have the
same tile config parameters, then downgrade the depth surface tile mode
to 1DThin1.
4. If the depth surface's tile mode was 1DThin1, then return the
1DThin1 tile index for stencil.
5. If the depth surface's tile mode is PRT, then return an invalid tile
index for stencil, since their tile config parameters will never match.

The client driver then checks the returned stencil tile index: if it
is not the invalid tile index, it calls AddrComputeSurfaceInfo() on the
stencil surface with the returned stencil tile index to get the full
output information. Note that the client needs to set the flag
"useTileIndex" when AddrLib is created.
2017-03-30 14:44:33 +02:00
Frans Gu
6764d96eaa amdgpu/addrlib: Adjust bank equation bit order based on macro tile aspect ratio settings
This way, we can have a valid equation for 2D_THIN1 tile mode.
Add flag "preferEquation" to return equation index without adjusting
input tile mode.
2017-03-30 14:44:33 +02:00
Frans Gu
ed1aca8e8f amdgpu/addrlib: do some tile mode conversions to display surface 2017-03-30 14:44:33 +02:00
Xavi Zhang
cb8844392c amdgpu/addrlib: Check prt flag for PRT_THIN1 extra padding for DCC. 2017-03-30 14:44:33 +02:00
Frans Gu
fe216415c6 amdgpu/addrlib: Add new flags minimizePadding and maxBaseAlign
1) minimizePadding - Use 1D tile mode if padded size of 2D is bigger
than 1D
2) maxBaseAlign - Force PRT tile mode if macro block size is bigger than
requested alignment.

Also, related changes to tile mode optimization for needEquation.
2017-03-30 14:44:33 +02:00
Xavi Zhang
4dd4700612 amdgpu/addrlib: Always returns pixelPitch in original pixels 2017-03-30 14:44:33 +02:00
Sabre Shao
eb3036ed46 amdgpu/addrlib: fix crash on allocation failure 2017-03-30 14:44:33 +02:00
Frans Gu
680f91e5d4 amdgpu/addrlib: Add flag to report if a surface can have dcc ram 2017-03-30 14:44:33 +02:00
Roy Zhan
ca88f83222 amdgpu/addrlib: support non-power2 height alignment (for linear surface) 2017-03-30 14:44:33 +02:00
Frans Gu
c867a2b222 amdgpu/addrlib: Fix family setting for VI and CZ ASICs 2017-03-30 14:44:33 +02:00
Nicolai Hähnle
b328e47d3d amdgpu/addrlib: style cleanup
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Nicolai Hähnle
fbc9ba7559 amdgpu/addrlib: Pad pitch to multiples of 256 for DCC surface on Fiji
The change also modifies function CiLib::HwlPadDimensions to report
adjusted pitch alignment.
2017-03-30 14:44:33 +02:00
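A minimal sketch of the kind of padding described in fbc9ba7559 (pitch rounded up to a multiple of 256). The helper name `sketch_pad_to_multiple` is hypothetical and is not the addrlib code; it uses division rather than bit masking so it also works for alignments that are not a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of addrlib: round value up to the next
 * multiple of `multiple`. Works for non-power-of-two multiples too.
 * Assumes value + multiple - 1 does not overflow 32 bits. */
static uint32_t
sketch_pad_to_multiple(uint32_t value, uint32_t multiple)
{
   assert(multiple > 0);
   return ((value + multiple - 1) / multiple) * multiple;
}
```

For example, a pitch of 1000 padded to a multiple of 256 becomes 1024, and a pitch that is already aligned is returned unchanged.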
Xavi Zhang
145750efba amdgpu/addrlib: Fix number of //
Find ^/{80,99}$  and replace them to 100 "/"

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Nicolai Hähnle
4e2668ecd1 amdgpu/addrlib: Cleanup.
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Xavi Zhang
d1ecb70ba3 amdgpu/addrlib: Use namespaces
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Kevin Zhao
8912862a40 amdgpu/addrlib: Adjust 99 "*" to 100 "*" alignment
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Frans Gu
acaeae2861 amdgpu/addrlib: Add a new tile mode ADDR_TM_UNKNOWN
This can be used by address lib client to ask address lib to select
tile mode.
2017-03-30 14:44:33 +02:00
Xavi Zhang
90029b958e amdgpu/addrlib: Stylish cleanup.
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Roy Zhan
554c1b9f2d amdgpu/addrlib: Disable tcCompatible when depth surface is not macro tiled
Experiments show that 1D tiling and TcCompatible cannot work together.
2017-03-30 14:44:33 +02:00
Xavi Zhang
120a5d0e42 amdgpu/addrlib: fix pixel index calculation of thick micro tiling 2017-03-30 14:44:33 +02:00
Xavi Zhang
199912a9bc amdgpu/addrlib: Add a flag to skip calculate indices
This is useful for debugging and for special cases where stencil surfaces
do not need to be texture-fetch compatible.
2017-03-30 14:44:33 +02:00
Nicolai Hähnle
10f7d1cb03 amdgpu/addrlib: add equation generation
1. Add a new surface flag needEquation that the client driver can use to
force the surface tile setting to be equation compatible. Override 2D/3D
macro tile mode to PRT_* tile mode if this flag is TRUE and num slice > 1.
2. Add numEquations and pEquationTable to the ADDR_CREATE_OUTPUT structure
to return the number of equations and the equation table to the client driver.
3. Add equationIndex to the ADDR_COMPUTE_SURFACE_INFO_OUTPUT structure to
return the equation index to the client driver.

Please note that the use of address equations has the following restrictions:
1) The surface can't be splittable
2) The surface can't have a non-zero tile swizzle value
3) A surface with > 1 slices must have a PRT tile mode, which disables
slice rotation
2017-03-30 14:44:33 +02:00
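The three restrictions listed in 10f7d1cb03 can be read as a single predicate. The sketch below only restates them; the struct and its field names are hypothetical and do not match the real addrlib structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of a surface; illustrative only, these are not
 * the addrlib input/output structures. */
struct sketch_surface {
   bool     splittable;     /* surface may be tile-split */
   uint32_t tile_swizzle;   /* tile swizzle value */
   uint32_t num_slices;     /* number of slices */
   bool     prt_tile_mode;  /* surface uses a PRT_* tile mode */
};

/* Returns true when none of the three restrictions from the commit
 * message rules out using an address equation. */
static bool
sketch_equation_allowed(const struct sketch_surface *s)
{
   if (s->splittable)
      return false;                 /* 1) must not be splittable */
   if (s->tile_swizzle != 0)
      return false;                 /* 2) tile swizzle must be zero */
   if (s->num_slices > 1 && !s->prt_tile_mode)
      return false;                 /* 3) multi-slice requires PRT */
   return true;
}
```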
Nicolai Hähnle
3e44337bd6 amdgpu/addrlib: rename ComputeSurfaceThickness to Thickness 2017-03-30 14:44:33 +02:00
Xavi Zhang
79dcda5116 amdgpu/addrlib: add define HAVE_TSERVER 2017-03-30 14:44:33 +02:00
Frans Gu
7293a020bd amdgpu/addrlib: Add new interface to support macro mode index query 2017-03-30 14:44:33 +02:00
Roy Zhan
c16e1e2041 amdgpu/addrlib: add explicit Log2NonPow2 function 2017-03-30 14:44:33 +02:00
Nicolai Hähnle
4a4b7da141 amdgpu/addrlib: Fix invalid access to m_tileTable
Sometimes client driver passes valid tile info into address library,
in this case, the tile index is computed in function
HwlPostCheckTileIndex instead of CiAddrLib::HwlSetupTileCfg.
We need to call HwlPostCheckTileIndex to calculate the correct tile
index to get tile split bytes for this case.
2017-03-30 14:44:33 +02:00
Nicolai Hähnle
9e40e09089 amdgpu/addrlib: add ADDR_ANALYSIS_ASSUME
It helps fix analysis warnings in MSC.
2017-03-30 14:44:33 +02:00
XiaoYuan Zheng
6164f23a91 amdgpu/addrlib: add tcCompatible htile addr from coordinate support. 2017-03-30 14:44:33 +02:00
Carlos Xiong
3bd1380ab2 amdgpu/addrlib: force all zero tile info for linear general. 2017-03-30 14:44:33 +02:00
Nicolai Hähnle
8b110f0319 amdgpu/addrlib: Add a member "bpp" for input of method AddrConvertTileIndex and AddrConvertTileInfoToHW
When a client queries tile info from a tile index and expects accurate
tileSplit info, the bits-per-pixel value must be provided, since it is
needed to compute tileSplitBytes; otherwise, if bpp is 0, Addrlib
returns the value of "tileBytes" instead, which is also the
current logic. If a client doesn't need tileSplit info, it's OK to pass
a bpp of 0.
2017-03-30 14:44:33 +02:00
Frans Gu
ca6a38fd6a amdgpu/addrlib: Refine the PRT tile mode selection
Switch the tile index based on logic instead of hardcoded threshold
for different ASIC.
2017-03-30 14:44:33 +02:00
Xavi Zhang
2bf243f7c6 amdgpu/addrlib: add dccRamSizeAligned output flag
This flag indicates to the client whether this level's DCC memory is
aligned. Not aligned means there is padding at the end.
2017-03-30 14:44:33 +02:00
Nicolai Hähnle
e443b48966 amdgpu/addrlib: Change comment alignment
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Nicolai Hähnle
e06aeaf19f amdgpu/addrlib: style changes and minor cleanups
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Nicolai Hähnle
cb5d22a3f3 amdgpu/addrlib: AddrLib inheritance refactor
Add one more abstraction layer into inheritance system.

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Nicolai Hähnle
52a1288a15 amdgpu/addrlib: rearrange code in preparation of refactoring
No code changes.

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Xavi Zhang
f12d430c59 amdgpu/addrlib: add disableLinearOpt flag 2017-03-30 14:44:33 +02:00
Xavi Zhang
b5d8120a07 amdgpu/addrlib: Add GetMaxAlignments 2017-03-30 14:44:33 +02:00
Xavi Zhang
3c3d620cf3 amdgpu/addrlib: Let Kaveri go general stereo right eye offset padding path
Kaveri (2-pipe) macro tiling mode table was initially set to all
4-aspect-ratio so the swizzling path did not work for it and then we
chose to pad the offset. We now discover the root cause is that if
ratio > 2, the swizzling path does not work. So we can safely use the
same path for Kaveri.
2017-03-30 14:44:33 +02:00
Xavi Zhang
3614999878 amdgpu/addrlib: Rewrite tile mode optimization code
Note: remove reference to degrade4Space and use opt4Space instead.
2017-03-30 14:44:33 +02:00
Carlos Xiong
c12e35065a amdgpu/addrlib: Add a flag "tcCompatible" to surface info output structure.
Even if the surface info input flag "tcCompatible" is enabled, tc
compatible may not be supported if tile split happens for depth
surfaces. Add a new flag in the output structure to notify the client to
disable tc compatible in this case.
2017-03-30 14:44:33 +02:00
Xavi Zhang
2ffb30c2af amdgpu/addrlib: Make comments shorter
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
XiaoYuan Zheng
3c7bd4e013 amdgpu/addrlib: add new flag nonSplit
Flag tcCompatible has different usage in CI and VI. Add a new flag
"nonSplit" for CI.
2017-03-30 14:44:33 +02:00
Xiao-Tao Zai
47de94a794 amdgpu/addrlib: allow tileSplitBytes greater than row size
Carrizo row size is 1K, while tileSplitBytes is 2K for a 4xAA 32bpp
depth surface. Remove the sanity check that tileSplitBytes must not be
greater than row size. There could be a performance loss, but it may be
covered by non-split depth, which enables tc-compatible read.
2017-03-30 14:44:33 +02:00
Carlos Xiong
d52e0bbfe6 amdgpu/addrlib: Change to compute TC compatible stencil info
Change the logic to compute tc compatible stencil info via depth's
tileIndex instead of using depth's tileInfo. So the clients can get
the stencil's tileInfo computed from macroModeTable. If the stencil
tileInfo is same as depth tileInfo, then stencil is tc compatible;
otherwise, stencil is not tc compatible. The current suggestion is to
create another stencil buffer with the tc compatible tileInfo, use
depth-to-color copy to decompress and tile convert the rendered
stencil to tc compatible stencil (And use the new stencil buffer to
program TC).
2017-03-30 14:44:33 +02:00
Nicolai Hähnle
6c65f256e2 amdgpu/addrlib: rename SiAddrLib/CiAddrLib to match internal spelling
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:44:33 +02:00
Marek Olšák
6e44087e77 configure.ac: require libdrm_amdgpu 2.4.76 for Vega
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 14:42:06 +02:00
Samuel Pitoiset
e7850bb7f0 st/glsl_to_tgsi: use glsl_type::sampler_index()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 13:15:34 +02:00
Samuel Pitoiset
784d3a7066 glsl: allow glsl_type::sampler_index() with images
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 13:15:16 +02:00
Nicolai Hähnle
257ee3f7ef st/mesa: improve error messages and fix security warning
Debian, Ubuntu set default build flag: -Werror=format-security

  CC       state_tracker/st_cb_texturebarrier.lo
state_tracker/st_cb_eglimage.c: In function ‘st_egl_image_get_surface’:
state_tracker/st_cb_eglimage.c:64:7: error: format not a string literal and no format arguments [-Werror=format-security]
       _mesa_error(ctx, GL_INVALID_VALUE, error);
       ^~~~~~~~~~~
state_tracker/st_cb_eglimage.c:71:7: error: format not a string literal and no format arguments [-Werror=format-security]
       _mesa_error(ctx, GL_INVALID_OPERATION, error);
       ^~~~~~~~~~~

Reported-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Fixes: 83e9de25f3 ("st/mesa: EGLImageTarget* error handling")
2017-03-30 11:24:36 +02:00
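The warning quoted in 257ee3f7ef fires when a non-literal string is passed as the format argument of a printf-style function. A minimal sketch of the usual remedy, passing the message behind a literal "%s" format, follows. `sketch_report_error` is a hypothetical stand-in for a printf-style reporter and is not `_mesa_error`.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical printf-style error reporter that formats into a caller
 * buffer. It only stands in for functions such as _mesa_error(). */
static int
sketch_report_error(char *out, size_t out_size, const char *fmt, ...)
{
   va_list args;
   va_start(args, fmt);
   int n = vsnprintf(out, out_size, fmt, args);
   va_end(args);
   return n;
}

/* Safe pattern: the message is data, the format is the literal "%s",
 * so any '%' inside the message is copied verbatim instead of being
 * interpreted as a conversion. */
static int
sketch_report_message(char *out, size_t out_size, const char *message)
{
   return sketch_report_error(out, out_size, "%s", message);
}
```

Calling `sketch_report_error(out, size, message)` directly is the pattern that -Werror=format-security rejects.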
Kenneth Graunke
e4dc005bce i965: Combine intel_batchbuffer_reloc and intel_batchbuffer_reloc64
These two functions do the exact same thing.  One returns a uint64_t,
and the other takes the same uint64_t and truncates it to a uint32_t.

We only need the uint64_t variant - the caller can truncate if it wants.
This patch gives us one function, intel_batchbuffer_reloc, that does
the 64-bit thing.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-30 00:15:28 -07:00
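The idea in e4dc005bce, keeping only the 64-bit variant and letting the caller truncate, can be sketched as follows. `sketch_reloc` and its parameters are hypothetical; the real intel_batchbuffer_reloc takes different arguments and also records the relocation.

```c
#include <stdint.h>

/* Hypothetical 64-bit relocation helper: returns the full presumed
 * address of the target. Recording of the relocation entry is omitted. */
static uint64_t
sketch_reloc(uint64_t presumed_base, uint32_t delta)
{
   return presumed_base + delta;
}
```

A caller that only has room for 32 bits truncates at the call site, for example `uint32_t low = (uint32_t)sketch_reloc(base, delta);`, so no second function is needed.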
Kenneth Graunke
5177231670 i965: Use WARN_ONCE instead of open coding it.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-30 00:15:09 -07:00
Harish Krupo
36cb2003f1 android: pass sse4.1 flag as appropriate
We have functions which depend on sse4.1 support but we didn't pass
the right compile flag for it. This patch fixes it.

Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com>
Signed-off-by: Harish Krupo <harish.krupo.kps@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-03-30 08:02:49 +03:00
Dave Airlie
a930c2c612 radv: fix mask attribs properly.
some days it just doesn't pay to get out of bed.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-30 13:09:30 +10:00
Dave Airlie
aa27a9f687 radv: fix regression with mask attrib setting code.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-30 12:07:32 +10:00
Dave Airlie
2b35b60df1 radv: move to using nir clip/cull merge pass.
Doing this before tessellation makes doing some bits of
tessellation a bit cleaner. It also cleans up a bit of the
llvm generator code.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-30 11:04:56 +10:00
George Kyriazis
5079c277b5 swr: [scons] Fix windows build
Fix codegen build break that was introduced earlier

v2: update rules for gen_knobs.cpp and gen_knobs.h

v3: Introduce bldroot and revert generator file changes, making patch simpler.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-03-29 18:52:07 -05:00
Craig Stout
1da7a11de8 anv/cmd_buffer: fix host memory leak
push_constants must be free'd.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100452
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-29 14:32:32 -07:00
Timothy Arceri
16debc652a mesa/glthread: fallback to sync if count validation fails
The old code would sync and then throw a cryptic error message.
There is no need for a custom error, we can just fallback to
the real function and have it do proper validation.

Fixes piglit test:
glsl-uniform-out-of-bounds

Which was returning the wrong error code.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 08:23:00 +11:00
Timothy Arceri
18f4c93b02 mesa/glthread: add async support to glProgramUniform*() functions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 08:22:51 +11:00
Timothy Arceri
1ea73b9c61 mesa/glthread: print out syncs when MARSHAL_MAX_CMD_SIZE is exceeded
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-30 08:19:07 +11:00
Jason Ekstrand
9aba81b160 anv/batch_chain: Handle another OOM in cmd_buffer_execbuf
Found by inspection while rebasing other patches.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-29 09:39:49 -07:00
Philipp Zabel
83e9de25f3 st/mesa: EGLImageTarget* error handling
Stop trying to specify texture or renderbuffer objects for unsupported
EGL images. Generate the error codes specified in the OES_EGL_image
extension.

EGLImageTargetTexture2D and EGLImageTargetRenderbuffer would call
the pipe driver's create_surface callback without ever checking that
the given EGL image is actually compatible with the chosen target
texture or renderbuffer. This patch adds a call to the pipe driver's
is_format_supported callback and generates an INVALID_OPERATION error
for unsupported EGL images. If the EGL image handle does not describe
a valid EGL image, an INVALID_VALUE error is generated.

v2: fixed get_surface to actually use the usage and error parameters

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-29 18:04:42 +02:00
Philipp Zabel
d10172d527 st/mesa: move st_manager_get_egl_image_surface into st_cb_eglimage.c
The only callers are here, and we will add generation of GL errors in
the following patch.  Rename the function to st_egl_image_get_surface,
pass the gl_context instead of st_context, and move the cast from
GLeglImageOES to void* into st_egl_image_get_surface.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-29 18:04:12 +02:00
Alejandro Piñeiro
2f8d6bd578 i965: expose BRW_OPCODE_[F32TO16/F16TO32] name on gen8+
Technically those hw operations are only available on gen7, as gen8+
support the conversion on the MOV. But, when using the builder to
implement nir operations (example: nir_op_fquantize2f16), it is not
needed to do the gen check. This check is done later, on the final
emission at brw_F32TO16 (brw_eu_emit), choosing between the MOV or the
specific operation accordingly.

So in the middle, during optimization phases those hw operations can
be around for gen8+ too.

Without this patch, several (at least 95) vulkan-cts quantize tests
crash when using INTEL_DEBUG=optimizer. For example:
dEQP-VK.spirv_assembly.instruction.graphics.opquantize.too_small_vert

v2: simplify the code using GEN_GE (Ilia Mirkin)
v3: tweak brw_instruction_name instead of changing opcode_descs
    table, that is used for validation (Matt Turner)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-29 17:34:15 +02:00
Marek Olšák
a2db9f9ff4 mesa: remove dd_function_table::BindProgram
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-29 15:44:00 +02:00
Marek Olšák
e81ee82119 r200: remove BindProgram
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-29 15:44:00 +02:00
Marek Olšák
bbb5561007 i915: remove BindProgram
The same thing is done in i915_update_program called by i915InvalidateState.
Why do it twice.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-29 15:44:00 +02:00
Marek Olšák
96a1c2406d mesa: don't use _NEW_TEXTURE mainly in mesa/main
v2: add missing %s

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-29 15:44:00 +02:00
Marek Olšák
d68150f15d mesa: split _NEW_TEXTURE into _NEW_TEXTURE_OBJECT & _NEW_TEXTURE_STATE
No performance testing has been done, because it makes sense to make this
change regardless of that. Also, _NEW_TEXTURE is still used in many places,
but the obvious occurrences are replaced here.

It's now possible to split _NEW_TEXTURE_OBJECT further.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-29 15:44:00 +02:00
Marek Olšák
226ff6aa30 mesa: inline _mesa_update_texture
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-29 15:44:00 +02:00
Jose Fonseca
bb9faba172 appveyor: Update dependencies.
- Use explicit versions everywhere.
- Avoid the deprecated `--egg` pip option.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-03-29 11:53:03 +01:00
Jose Fonseca
ecfafdcbf5 c11/threads: Include thr/xtimec.h for xtime definition when building with MSVC.
MSVC has been including a xtime definition in thr/xtimec.h ever since
MSVC 2013 (which is the minimum we require for building Mesa), and
including it prevents duplicate definitions when it gets included by
LLVM.

In fact, it looks like MSVC has been including a partial C11 threads
implementation too for some time, which we should consider migrating to
once we eliminate the use of _MTX_INITIALIZER_NP in our tree.

Thanks to the anonymous helper from
https://bugs.freedesktop.org/show_bug.cgi?id=100201#c4 for spotting
this.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100201
CC: "17.0" <mesa-stable@lists.freedesktop.org>
2017-03-29 11:53:03 +01:00
Timothy Arceri
e44cba540e mesa: update lower_jumps tests after bug fix
This change updates the tests to reflect the IR after
the following bug fix.

Fixes: c1096b7f1d ("glsl: fix lower jumps for returns when loop is
                      inside an if")

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Bugzilla: https://bugs.freedesktop.org/100441
2017-03-29 20:53:06 +11:00
Thomas Hellstrom
ba8df2286a gbm/dri: Flush after unmap
Drivers may queue dma operations on the context at unmap time so we need
to flush to make sure the data gets to the bo. Ideally the application
would take care of this, but since there appears to be no exported gbm
flush functionality we need to explicitly flush at unmap time.

This fixes a problem where kmscube on vmwgfx in rgba textured mode would
render using an uninitialized texture rather than the intended
rgba pattern.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-29 09:17:21 +02:00
Bas Nieuwenhuizen
3df410069a radv: Enable sparseBinding feature.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-29 08:50:55 +02:00
Bas Nieuwenhuizen
b20af5c8d7 radv/amdgpu: Use reference counting for bos.
Per the Vulkan spec, memory objects may be deleted before the buffers
and images using them are deleted, although those resources then
cannot be used except for deletion themselves.

For the virtual buffers, we need to access them on resource destruction
to unmap the regions, so this results in a use-after-free. Implement
reference counting to avoid this.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-29 08:50:48 +02:00
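A minimal sketch of the reference counting described in b20af5c8d7, where a buffer object must outlive the memory object's deletion until the last resource using it is destroyed. The struct and function names are hypothetical, and the counter is not atomic, which a real multi-threaded driver would likely need.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical buffer object; not the radv winsys structure. */
struct sketch_bo {
   unsigned ref_count;
};

/* Create a buffer object holding one reference (the creator's). */
static struct sketch_bo *
sketch_bo_create(void)
{
   struct sketch_bo *bo = calloc(1, sizeof(*bo));
   if (bo)
      bo->ref_count = 1;
   return bo;
}

/* Take an extra reference, e.g. when a virtual buffer maps this bo. */
static void
sketch_bo_ref(struct sketch_bo *bo)
{
   bo->ref_count++;
}

/* Drop a reference. Returns 1 if this was the last one and the object
 * was freed, 0 if other users still hold it. */
static int
sketch_bo_unref(struct sketch_bo *bo)
{
   assert(bo->ref_count > 0);
   if (--bo->ref_count == 0) {
      free(bo);
      return 1;
   }
   return 0;
}
```

With this scheme, freeing the memory object only drops one reference; the resource that still needs the bo for unmapping keeps it alive until its own destruction.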
Bas Nieuwenhuizen
e527e62e75 radv: Implement sparse memory binding.
v2: Only submit when semaphores are specified.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-29 08:50:41 +02:00
Bas Nieuwenhuizen
6154efc193 radv: Implement sparse image creation.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-29 08:50:37 +02:00
Bas Nieuwenhuizen
ef0e505d02 radv: Implement sparse buffer creation.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-29 08:50:33 +02:00
Bas Nieuwenhuizen
715df30a4e radv/amdgpu: Add winsys implementation of virtual buffers.
v2: - Added comments.
    - Fixed a double unmap bug.
    - Actually unmap the non-edge old ranges.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-29 08:50:17 +02:00
Bas Nieuwenhuizen
78ee8b3f84 radv: Assert when setting 0 registers in a sequence.
To catch more of those hangs early.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-03-29 01:58:16 +02:00
Jason Ekstrand
f3673db3d6 anv/cmd_buffer: Refactor flush_pipeline_select_*
While having the _3d and _gpgpu versions is nice, there's no reason why
we need to have duplicated logic for tracking the current pipeline.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-03-28 14:57:09 -07:00
Jason Ekstrand
6baae9625d anv: Flush caches prior to PIPELINE_SELECT on all gens
The programming note that says we need to do this still exists in the
SkyLake PRM and, from looking at the bspec, seems like it may apply to
all hardware generations SNB+.  Unfortunately, this isn't particularly
clear cut since there is also language in the bspec that says you can
skip the flushing and stall to get better throughput.  Experimentation
with the "Car Chase" benchmark in GL seems to indicate that some form of
flushing is still needed.  This commit makes us do the full set of
flushes regardless of hardware generation.  We can always reduce the
flushing later.

Reported-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-28 14:57:08 -07:00
Jason Ekstrand
0fe3dcce4c anv/cmd_buffer: Fix bad indentation
A bunch of code was indented in such a way that it looked like it went
with the if statement above but it definitely didn't.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-28 14:57:06 -07:00
Jason Ekstrand
01a65dc43b anv/cmd_buffer: Apply flush operations prior to executing secondaries
This fixes rendering issues in the Vulkan port of skia on some hardware.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-03-28 14:56:55 -07:00
Jason Ekstrand
9319ef96fd anv/blorp: Use anv_get_layerCount everywhere
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-03-28 14:41:48 -07:00
Jason Ekstrand
1b8fa8dd79 anv: Make anv_get_layerCount a macro
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-03-28 14:41:47 -07:00
Dave Airlie
93d61e4945 radv: only emit ps_input_cntl if we have any to output
Otherwise we get GPU hangs.

Reported-by: Alex Smith <asmith@feralinteractive.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 20:12:10 +01:00
Adam Jackson
f208bdc0d2 glx: Remove #include <GL/glxint.h>
We're not using anything in it, and we don't want to inherit struct
definitions from some other package anyway.

Signed-off-by: Adam Jackson <ajax@redhat.com>
2017-03-28 14:48:12 -04:00
Julien Isorce
7ee91af300 r600g: check NULL return from r600_aligned_buffer_create
Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-28 18:27:55 +01:00
Julien Isorce
699cce3493 st_cb_bitmap: check NULL return from u_upload_alloc
Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-28 18:27:55 +01:00
Julien Isorce
4a5e779b5f si_compute: check NULL return from u_upload_alloc
Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-28 18:27:53 +01:00
Julien Isorce
c5fe99eec2 r600g: check NULL return from u_upload_alloc
Like done in si_state_draw.c::si_draw_vbo

u_upload_alloc can fail, i.e. set output param *ptr to NULL, for 2 reasons:
alloc fails or map fails. For both there is already a fprintf/stderr in
radeon_create_bo and radeon_bo_do_map.

In src/gallium/drivers/ it is common practice to avoid a crash with just
a silent check, and to leave the fprintf to the place the error comes from,
around the libdrm calls.

Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-28 17:54:15 +01:00
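The pattern from c5fe99eec2 can be sketched as below: the allocator reports failure by setting its output pointer to NULL, and the caller checks silently and bails out. `sketch_upload_alloc` is a hypothetical stand-in with a simplified signature; it is not the real u_upload_alloc.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator mimicking the contract described above: on
 * failure it sets *ptr to NULL. simulate_failure exists only so the
 * failure path can be exercised. */
static void
sketch_upload_alloc(size_t size, bool simulate_failure, void **ptr)
{
   *ptr = simulate_failure ? NULL : malloc(size);
}

/* Copies data into freshly allocated storage. Returns false, without
 * touching any memory, when the allocation failed. No message is
 * printed here; the error is assumed to be logged where it occurs. */
static bool
sketch_upload_data(const void *data, size_t size, bool simulate_failure,
                   void **out)
{
   void *ptr;

   sketch_upload_alloc(size, simulate_failure, &ptr);
   if (!ptr) {
      *out = NULL;
      return false;   /* silent check: avoid dereferencing NULL */
   }

   memcpy(ptr, data, size);
   *out = ptr;
   return true;
}
```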
Tim Rowley
749cf3be6e swr: fix llvm-5.0.0 build bustage
Handle rename of llvm AttributeSet to AttributeList in the same
fashion as ac_llvm_helper.cpp.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-28 11:46:58 -05:00
Tim Rowley
79d92a72d5 swr: [rasterizer jitter] fix llvm-5.0.0 build bustage
Add CreateAlignmentAssumptionHelper to gen_llvm_ir_macros.py ignore list.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-28 11:46:58 -05:00
Chad Versace
d1032a047b isl: Drop unused isl_surf_init_info::min_pitch
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-28 09:44:44 -07:00
Chad Versace
6cbc13d94c intel: Fix requests for exact surface row pitch (v2)
All callers of isl_surf_init() that set 'min_row_pitch' wanted to
request an *exact* row pitch, as evidenced by nearby asserts, but isl
lacked API for doing so. Now that isl has an API for that, update the
code to use it.

v2: Assert that isl_surf_init() succeeds because the callers assume
    it.  [for jekstrand]

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> (v1)
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
2017-03-28 09:44:44 -07:00
Chad Versace
e9017d58dc isl: Let isl_surf_init's caller set the exact row pitch (v2)
The caller does so by setting the new field
isl_surf_init_info::row_pitch.

v2: Validate the requested row_pitch.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
2017-03-28 09:44:44 -07:00
Chad Versace
23802dafc2 isl: Validate the calculated row pitch (v5)
Validate that isl_surf::row_pitch fits in the below bitfields,
if applicable based on isl_surf::usage.

    RENDER_SURFACE_STATE::SurfacePitch
    RENDER_SURFACE_STATE::AuxiliarySurfacePitch
    3DSTATE_DEPTH_BUFFER::SurfacePitch
    3DSTATE_HIER_DEPTH_BUFFER::SurfacePitch

v2:
  -Add a Makefile dependency on generated header genX_bits.h.
v3:
  - Test ISL_SURF_USAGE_STORAGE_BIT too. [for jekstrand]
  - Drop explicit dependency on generated header. [for emil]
v4:
  - Rebase for new gen_bits_header.py script.
  - Replace gen_10x with gen_device_info*.
v5:
  - Drop FINISHME for validation of GEN9 1D row pitch. [for jekstrand]
  - Reformat bit tests. [for jekstrand]

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v4)
2017-03-28 09:44:44 -07:00
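A sketch of the kind of check 23802dafc2 describes: testing whether a row pitch fits in a bitfield of a given width. The function name is hypothetical, and the sketch assumes the field stores the pitch minus one; that encoding is an assumption here, not something stated in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check, not the isl code. Assumes the hardware field
 * holds (row_pitch - 1), so a field of N bits can encode pitches in
 * the range 1 .. 2^N. */
static bool
sketch_pitch_fits(uint32_t row_pitch, uint32_t field_bits)
{
   if (row_pitch == 0 || field_bits == 0 || field_bits > 32)
      return false;

   uint64_t max_encoded = (UINT64_C(1) << field_bits) - 1;
   return (uint64_t)row_pitch - 1 <= max_encoded;
}
```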
Chad Versace
f0eaf38db2 genxml: New generated header genX_bits.h (v6)
genX_bits.h contains the sizes of bitfields in genxml instructions,
structures, and registers. It also defines some functions to query those
sizes.

isl_surf_init() will use the new header to validate that requested
pitches fit in their destination bitfields.

What's currently in genX_bits.h:

  - Each CONTAINER::Field from gen*.xml that has a bitsize has a macro
    in genX_bits.h:

        #define GEN{N}_CONTAINER_Field_bits {bitsize}

  - For each set of macros whose name, after stripping the GEN prefix,
    is the same, genX_bits.h contains a query function:

      static inline uint32_t __attribute__((pure))
      CONTAINER_Field_bits(const struct gen_device_info *devinfo);

v2 (Chad Versace):
  - Parse the XML instead of scraping the generated gen*_pack.h headers.

v3 (Dylan Baker):
  - Port to Mako.

v4 (Jason Ekstrand):
  - Make the _bits functions take a gen_device_info.

v5 (Chad Versace):
  - Fix autotools out-of-tree build.
  - Fix Android build. Tested with git://github.com/android-ia/manifest.
  - Fix macro names. They were all missing the "_bits" suffix.
  - Fix macro names more. Remove all double-underscores.
  - Unindent all generated code. (It was floating in a sea of whitespace).
  - Reformat header to appear human-written not machine-generated.
  - Sort gens from high to low. Newest gens should come first because,
    when we read code, we likely want to read the gen8/9 code and ignore
    the gen4 code. So put the gen4 code at the bottom.
  - Replace 'const' attributes with 'pure', because the functions now
    have a pointer parameter.
  - Add --cpp-guard flag. Used by Android.
  - Kill class FieldCollection. After Jason's rewrite, it was just
    a dict.

v6 (Chad Versace):
  - Replace `key not in d.keys()` with `key not in d`. [for dylan]

Co-authored-by: Dylan Baker <dylan@pnwbakers.com>
Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v5)
Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v6)
2017-03-28 09:44:44 -07:00
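The shape of the generated header described in f0eaf38db2 (one macro per gen, plus a query function taking the device info) can be sketched as follows. Everything here is illustrative: the struct is a stand-in for gen_device_info, the SKETCH_ prefix is added to avoid suggesting real macro names, and the bit sizes are made up rather than taken from genxml.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct gen_device_info. */
struct sketch_device_info {
   int gen;
};

/* One macro per gen, newest first. The values are invented for the
 * sketch and do not come from any gen*.xml file. */
#define SKETCH_GEN9_CONTAINER_Field_bits 18
#define SKETCH_GEN8_CONTAINER_Field_bits 18
#define SKETCH_GEN7_CONTAINER_Field_bits 17

/* Query function: picks the macro matching the device's gen. It reads
 * through a pointer, which is why the commit message uses the 'pure'
 * attribute rather than 'const' on the real functions. */
static inline uint32_t
sketch_CONTAINER_Field_bits(const struct sketch_device_info *devinfo)
{
   switch (devinfo->gen) {
   case 9: return SKETCH_GEN9_CONTAINER_Field_bits;
   case 8: return SKETCH_GEN8_CONTAINER_Field_bits;
   case 7: return SKETCH_GEN7_CONTAINER_Field_bits;
   default: return 0;   /* field does not exist on this gen */
   }
}
```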
Tim Rowley
3974cfea25 swr: [rasterizer core] Disable inline function expansion
Disable expansion in windows Debug builds.

Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2017-03-28 11:24:44 -05:00
Tim Rowley
1c7224c85f swr: [rasterizer common] Use C++ thread_local keyword
Allows use of thread_local objects with constructors.

Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2017-03-28 11:24:39 -05:00
Tim Rowley
aee5276375 swr: [rasterizer core] SIMD16 Frontend WIP
Implement widened clipper and binner interfaces for SIMD16.

Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2017-03-28 11:24:33 -05:00
Tim Rowley
aea737e12e swr: [rasterizer core] Don't bind single-threaded contexts
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2017-03-28 11:24:27 -05:00
Tim Rowley
4cd0b1bb2c swr: [rasterizer core] Enable SIMD16
Make the AVX512 insert/extract intrinsics KNL-compatible

Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2017-03-28 11:24:21 -05:00
Tim Rowley
ec51e8ecfe swr: [rasterizer jitter] Clean up EngineBuilder construction
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2017-03-28 11:24:14 -05:00
Tim Rowley
89b83f4b1e swr: [rasterizer codegen] add cmdline to archrast gen files
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2017-03-28 11:24:09 -05:00
Tim Rowley
549b9d2e9f swr: [rasterizer core] SIMD16 Frontend WIP
Fix GS and streamout.

Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
2017-03-28 11:23:45 -05:00
Tim Rowley
fee3fc018b swr: [rasterizer codegen] Refactor codegen
Move common codegen functions into gen_common.py.

v2: change gen_knobs.py to find the template file internally, like
the rest of the gen scripts.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-28 11:23:04 -05:00
Juan A. Suarez Romero
caa616ccc4 tests/cache_test: allow crossing mount points
When using an overlayfs system (like a Docker container), rmrf_local()
fails because part of the files to be removed are in different mount
points (layouts). And thus cache-test fails.

Allowing the crossing of mount points is not a big problem, especially
because this is just for a test, not to be used in real code.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-28 18:00:39 +02:00
Emil Velikov
0f9a0cb5f5 glcpp/tests/glcpp-test-cr-lf: error out if we cannot find any tests
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:24 +01:00
Emil Velikov
d8096b75aa glcpp/tests/glcpp-test-cr-lf: correctly set/use srcdir/abs_builddir
Otherwise manual invocation of the script from elsewhere than
`dirname $0` will fail.

With these all the artefacts should be created in the correct location,
and thus we can remove the old (and slightly strange) clean-local line.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:24 +01:00
Emil Velikov
cf77cdce83 glcpp/tests: update testname in help string
Rather than hardcoding glcpp/other use `basename "$0"` which expands
appropriately.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:24 +01:00
Emil Velikov
4ea4fbf93a glcpp/tests/glcpp-test: error out if we cannot find any tests
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:24 +01:00
Emil Velikov
182d48ceb9 glcpp/tests/glcpp-test: print only the test basename
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:24 +01:00
Emil Velikov
addf62946d glcpp/tests/glcpp-test: set srcdir/abs_builddir variables
Current definitions work fine for the manual invocation of the script,
although the whole script does not consider that one can run it OOT.

The latter will be handled with later patches, although it will be
extensively using the two variables.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:24 +01:00
Emil Velikov
ee8aea3572 glsl/tests/optimization-test: 'echo' only folders which has generators
The current "let's print any folder which exists" is simply confusing.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:24 +01:00
Emil Velikov
79a95f19e6 glsl/tests/optimization-test: print only the test basedir/name
The relative/absolute path brings little to no benefit in being printed
as testname. Trim it out.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:24 +01:00
Emil Velikov
33cd136fa2 glsl/tests/optimization-test: error if zero tests were executed
We don't want to lie to ourselves that 'everything is fine' when no tests
were found/ran.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:23 +01:00
Emil Velikov
421115a729 glsl/tests/optimization-test: pass glsl_test as argument
Rather than hardcoding the binary location (which ends up wrong in a
number of occasions) in the python script, pass it as argument.

This allows us to remove a couple of dirname/basename workarounds that
aimed to keep this working, and succeeded in the odd occasion.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:23 +01:00
Emil Velikov
7d2a1394bb glsl/tests/optimization-test: error out if we fail to generate any tests
v2: use -eq over a string comparison (Eric)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:23 +01:00
Emil Velikov
86a937d264 glsl/tests/optimization-test: correctly manage srcdir/builddir
At the moment we look for generator script(s) in builddir while they
are in srcdir, and we proceed to generate the tests and expected output
in srcdir, which is not allowed.

To untangle:
 - look for the generator script in the correct place
 - generate the files in builddir, by extending create_test_cases.py to
use --outdir

With this in place the test passes `make check' for OOT builds, be it
standalone or as part of `make distcheck'.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:23 +01:00
Emil Velikov
a7d9f0a361 glsl/tests/optimisation-test: ensure that compare_ir is available
Bail out early if the script is not where we expect it to be.

v2: use -f instead of -e. latter returns true on folder(s)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:23 +01:00
Emil Velikov
9083c625f5 glsl/tests/optimization-test: correctly set compare_ir
Now that we have srcdir we can use it to correctly manage/point to the
script. Effectively fixing OOT invocation of `make check'.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:23 +01:00
Emil Velikov
44b6422258 glsl/tests/optimization-test: add fallback srcdir/abs_builddir defines
There is no robust way to detect either one, so simply hope for the best
and warn just in case.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:23 +01:00
Emil Velikov
05bc5b35a7 glsl/tests/optimisation-test: make sure that $PYTHON2 is set/available
Otherwise we'll fail when invoking the script outside of "make check"

v2: use -ne over a string comparison (Eric)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:23 +01:00
Emil Velikov
bd4be79fc5 glsl/tests/warnings-test: print only the test basename
Spamming the log with the (in some cases extremely long) test location
is of limited use.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:23 +01:00
Emil Velikov
1c58d08bd9 glsl/tests/warnings-test: error if zero tests were executed
We don't want to lie to ourselves that 'everything is fine' when no tests
were found/ran.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:22 +01:00
Emil Velikov
493fa69e37 glsl/tests/warnings-test: correctly manage srcdir/builddir
Before this commit, we would effectively fail to run any of the tests in
OOT builds.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:22 +01:00
Emil Velikov
81ccc7a484 glsl/tests/warnings-test: add fallback srcdir/abs_builddir defines
There is no robust way to detect either one, so simply hope for the best
and warn just in case.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:22 +01:00
Emil Velikov
4b366b171d glsl/tests/warnings-test: error out if glsl_compiler is missing
... or non-executable, in particular.

v2: use test -x (Eric)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:22 +01:00
Emil Velikov
1d93fa7be4 glsl: automake: export abs_builddir for the tests
We're going to use them with the next commits to determine where to put
the generated tests and/or built binaries.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:22 +01:00
Emil Velikov
841f0d2c58 glsl/tests: automake: cleanup all artefacts during clean-local
With later commits we'll fix the generators to produce the files in the
correct location. That in itself will cause an issue since the files
will be left dangling and make distcheck will fail.

v2: Use -r only as needed (Eric)

Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-28 15:31:22 +01:00
Nayan Deshmukh
3472be2bfd st/va: remove assert for single slice
we anyway allow for multiple slices

v2: do not remove assert to check for buf->size

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-03-28 12:08:54 +02:00
Nicolai Hähnle
21ba6543be radeonsi: use DMA for clears with unaligned size
Only a small tail needs to be uploaded manually.

This is only partly a performance measure (apps are expected to use
aligned access). Mostly it is preparation for sparse buffers, which the
old code would incorrectly have attempted to map directly.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-28 10:22:14 +02:00
Nicolai Hähnle
f0d9af772e radeonsi: CP DMA clear supports unaligned destination addresses
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-28 10:22:12 +02:00
Nicolai Hähnle
d9014952f5 radeonsi: remove the early-out for SDMA in si_clear_buffer
This allows the next patches to be simple while still being able
to make use of SDMA even in some unusual cases.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-28 10:22:01 +02:00
Dave Airlie
239a9224a3 radv: move shader stages calculation to pipeline.
With tess this becomes a bit more complex, so move to pipeline
for now.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:40:33 +10:00
Dave Airlie
0232ea8025 radv: move pa_cl_vs_out_cntl calculation to pipeline
This also takes the side band setting code from radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:40:29 +10:00
Dave Airlie
92e9c14a6a radv: move calculating fragment shader i/os to pipeline.
There is no need to calculate this on each command submit.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:40:20 +10:00
Dave Airlie
4b467c759e radv: move shader_z_format calculation to pipeline.
No need to recalculate this every time.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:40:17 +10:00
Dave Airlie
8996fdbf61 radv: move db_shader_control calculation to pipeline.
There is no need to recalculate this every time.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:40:14 +10:00
Dave Airlie
cd33a5c1cb radv: move vgt_gs_mode value to pipeline.
No need to recalculate this every time.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:40:08 +10:00
Dave Airlie
d43691ce77 radv: add parameter to emit_waitcnt.
This is just a precursor for tess support, which needs to
pass different values here.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:40:03 +10:00
Dave Airlie
931a8d0c9a radv: rework vertex/export shader output handling
In order to facilitate adding tess support, split the vs/es
output info into a separate block, so we make it easier to
have the tess shaders export the same info.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:39:59 +10:00
Dave Airlie
ae0551b4b3 radv: fix ia_multi_vgt_param for instanced vs indirect draw.
The logic was different than radeonsi, fix it up before adding
tess support.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:39:55 +10:00
Dave Airlie
a8b8e542c2 radv: handle NULL multisample state.
If rasterization is disabled, we can get a NULL multisample
state.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:39:38 +10:00
Bas Nieuwenhuizen
a8c51b1cd9 radv: flush DB cache before and after HTILE decompress.
It reads & writes the DB cache, and we haven't flushed dst caches yet,
so DB cache may be stale. Also the user might be shader read (and probably is),
so also flush after.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
2017-03-28 02:51:40 +02:00
Anuj Phogat
f5c32b0762 i965: Delete tile resource mode code
Yf/Ys tiling never got used in i965 due to not delivering
the expected performance benefits. So, this patch is deleting
this dead code in favor of adding it later in ISL when we
actually find it useful. ISL can then share this code between
vulkan and GL.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-27 16:17:18 -07:00
Anuj Phogat
bcee124ef7 i965: Delete fast copy blit code
Fast copy blit was primarily added to support Yf/Ys detiling.
But, Yf/Ys tiling never got used in i965 due to not delivering
the expected performance benefits. Also, replacing legacy blits
with fast copy blit didn't help the benchmarking numbers. This
is probably due to a h/w restriction that says "start pixel for
Fast Copy blit should be on an OWord boundary". This restriction
causes many blit operations to skip fast copy blit and use legacy
blits. So, this patch is deleting this dead code in favor of
adding it later when we actually find it useful.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-27 16:17:18 -07:00
Kenneth Graunke
088449487e i965: Require Kernel 3.6 for Gen4-5 platforms.
We've already required Kernel 3.6 on Gen6+ since Mesa 9.2 (May 2013,
commit 92d2f5acfa).  It seems reasonable
to require it for Gen4-5 as well, bumping the requirement from 2.6.39.

This is necessary for glClientWaitSync with a timeout to work, which
is a feature we expose on Gen4-5.  Without it, we would fall back to an
infinite wait, which is pretty bad.

See kernel commit 172cf15d18889313bf2c3bfb81fcea08369274ef in 3.6+.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-27 15:57:50 -07:00
Timothy Arceri
99dd3d1c3b glsl: fix spelling of embedded in comment
2017-03-28 09:56:27 +11:00
Timothy Arceri
c1096b7f1d glsl: fix lower jumps for returns when loop is inside an if
Previously we would just escape the loop and move everything
following the loop inside the if to the else branch of a new if
with a return flag conditional. However everything outside the
if the loop was nested in would still get executed.

Adding a new return to the then branch of the new if fixes this
and we just let a follow-up pass clean it up if needed.

Fixes:
tests/spec/glsl-1.10/execution/vs-nested-return-sibling-loop.shader_test
tests/spec/glsl-1.10/execution/vs-nested-return-sibling-loop2.shader_test

Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-03-28 09:54:31 +11:00
Dave Airlie
b640dfcd05 radv: don't emit no color formats. (v3)
If we had no rasterization, we'd emit SPI color
format as all 0's. The hw dislikes this, so add the workaround
from radeonsi.

Found while debugging tessellation

v2: handle at pipeline stage, we have to handle
it after we process the fragment shader. (Bas)
v3: simplify even further, remove old fallback.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 08:39:14 +10:00
Vinson Lee
f1f1cb41d0 mesa/tests: Link main-test with CLOCK_LIB.
Fix 'make check' linking error with glibc < 2.17.

  CXXLD  main-test
../../../../src/mesa/.libs/libmesa.a(libmesautil_la-u_queue.o): In function `u_thread_get_time_nano':
src/util/../../src/util/u_thread.h:84: undefined reference to `clock_gettime'

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2017-03-27 14:36:34 -07:00
Matt Turner
7dccd38b40 i965/fs: Don't emit SEL instructions for type-converting MOVs.
SEL can only convert between a few integer types, which we basically
never do.

Fixes fs/vs-double-uniform-array-direct-indirect-non-uniform-control-flow
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Acked-by: Francisco Jerez <currojerez@riseup.net>
2017-03-27 10:59:42 -07:00
Xu Randy
004468de14 anv/blorp: Fix a crash in CmdClearColorImage
We should use anv_get_layerCount() to access layerCount of
VkImageSubresourceRange in anv_CmdClearColorImage and
anv_CmdClearDepthStencilImage, which handles the
VK_REMAINING_ARRAY_LAYERS (~0) case.

Test: Sample multithreadcmdbuf from LunarG can run without crash

Signed-off-by: Xu Randy <randy.xu@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-03-27 07:43:17 -07:00
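
The VK_REMAINING_ARRAY_LAYERS handling mentioned in 004468de14 can be illustrated with a minimal Python sketch. The function name and parameters below are hypothetical and are not the driver's actual C helper; the sketch only assumes that the sentinel (~0) means "every layer from the base layer to the end of the image".

```python
# Hypothetical illustration, not Mesa code.
VK_REMAINING_ARRAY_LAYERS = 0xFFFFFFFF  # (~0U) as a 32-bit value


def effective_layer_count(image_array_size, base_array_layer, layer_count):
    """Resolve the layer count of a subresource range.

    Using layer_count directly would treat the sentinel as roughly
    four billion layers; resolving it first avoids that.
    """
    if layer_count == VK_REMAINING_ARRAY_LAYERS:
        return image_array_size - base_array_layer
    return layer_count
```

For an image with 6 array layers and a base layer of 2, the sentinel resolves to 4 layers, while an explicit count is returned unchanged.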
Brian Paul
804676f384 mesa: simplify code around 'variable_data' in marshal.c
Remove needless pointer increments, unneeded vars, etc.  Untested.
Plus, fix a couple comments.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-27 08:30:43 -06:00
Brian Paul
b71ef173a5 st/mesa: move duplicated st_ws_framebuffer() function into header file
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-27 08:30:43 -06:00
Andres Gomez
6255cc654d glsl: Interface Block instances don't need linking validation
From page 45 (page 52 of the PDF) of the GLSL ES 3.00 v.6 spec:

  " When instance names are present on matched block names, it is
    allowed for the instance names to differ; they need not match for
    the blocks to match."

From page 51 (page 57 of the PDF) of the GLSL 4.30 v.8 spec:

  " When instance names are present on matched block names, it is
    allowed for the instance names to differ; they need not match for
    the blocks to match."

Therefore, no cross linking validation is needed for the instance name
of an Interface Block.

With this patch, no link error is reported for a program
like this:

    "# VS

    layout(binding = 1) Block1 {
      vec4 color;
    } uni_block;

    ...

    # FS

    layout(binding = 2) Block2 {
      vec4 color;
    } uni_block;

    ..."

Fixes GL45-CTS.enhanced_layouts.ssb_layout_qualifier_conflict

Signed-off-by: Andres Gomez <agomez@igalia.com>
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-27 12:47:21 +03:00
Andres Gomez
40b09ed15c glsl: UBOs and SSBOs must match the binding qualifier too
From page 140 (page 147 of the PDF) of the GLSL ES 3.10 v.4 spec:

  " 9.2 Matching of Qualifiers

    The following tables summarize the requirements for matching of
    qualifiers.  It applies whenever there are two or more matching
    variables in a shader interface.

    Notes:

    1. Yes means the qualifiers must match.

    ...

    9.2.1 Linked Shaders

    | Qualifier | Qualifier | in/out | Default  | uniform | buffer|
    |   Class   |           |        | Uniforms |  Block  | Block |

    ...

    |  Layout   |  binding  |  N/A   |   Yes    |   Yes   |  Yes  |"

From page 93 (page 110 of the PDF) of the GL 4.2 (Core Profile) spec:

  " 2.11.7 Uniform Variables

    ...

    Uniform Blocks

    ...

    When a named uniform block is declared by multiple shaders in a
    program, it must be declared identically in each shader. The
    uniforms within the block must be declared with the same names and
    types, and in the same order. If a program contains multiple
    shaders with different declarations for the same named uniform
    block differs between shader, the program will fail to link."

From page 129 (page 150 of the PDF) of the GL 4.3 (Core Profile) spec:

  " 7.8 Shader Buffer Variables and Shader Storage Blocks

    ...

    When a named shader storage block is declared by multiple shaders
    in a program, it must be declared identically in each shader. The
    buffer variables within the block must be declared with the same
    names, types, qualification, and declaration order. If a program
    contains multiple shaders with different declarations for the same
    named shader storage block, the program will fail to link."

Therefore, if the binding qualifier differs between two linked Uniform
or Shader Storage Blocks of the same name, a link error should happen.

With this patch, a link error is reported for a program
like this:

    "# VS

    layout(binding = 1) Block {
      vec4 color;
    } uni_block1;

    ...

    # FS

    layout(binding = 2) Block {
      vec4 color;
    } uni_block2;

    ..."

Signed-off-by: Andres Gomez <agomez@igalia.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-27 12:47:00 +03:00
Andres Gomez
bf15b2b515 glsl: on UBO/SSBOs link error reset the number of active blocks to 0
While it's legal to have an active blocks count > 0 on link failure,
unless we actually assign memory for the blocks array we can end up
segfaulting in calls such as glUniformBlockBinding().

To avoid having to NULL check these api calls we simply reset the
block count to 0 if the array was not created.

Signed-off-by: Andres Gomez <agomez@igalia.com>
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-27 12:45:59 +03:00
Samuel Iglesias Gonsálvez
c4c02471f4 anv: enable sampling from fast-cleared images on SKL
A resolve is not needed on Skylake in this case. We were forcing
a resolve because we set the input_aux_usage to ISL_AUX_USAGE_NONE.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-03-27 06:32:24 +02:00
Grazvydas Ignotas
b97faea162 glsl, st/shader_cache: check the whole sha1 for zero
The checks were only looking at the first byte, while the intention
seems to be to check if the whole sha1 is zero. This prevented all
shaders with first byte zero in their sha1 from being saved.

This shaves around a second from Deus Ex load time on a hot cache.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-27 15:05:10 +11:00
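
The difference between the first-byte check and the whole-digest check described in b97faea162 can be sketched in Python as follows. The helper names are hypothetical and this is not the actual C code; it only assumes a 20-byte SHA-1 digest where all zeros means "not set".

```python
# Hypothetical illustration, not Mesa code.
SHA1_LEN = 20  # a SHA-1 digest is 20 bytes


def first_byte_is_zero(sha1: bytes) -> bool:
    """Buggy check: looks only at the first byte of the digest."""
    return sha1[0] == 0


def whole_sha1_is_zero(sha1: bytes) -> bool:
    """Intended check: true only when every byte of the digest is zero."""
    return len(sha1) == SHA1_LEN and not any(sha1)


unset = bytes(SHA1_LEN)                         # all-zero digest
valid = bytes([0]) + bytes(range(1, SHA1_LEN))  # real digest starting with 0x00
```

A digest such as `valid` is wrongly treated as unset by the first-byte check, which matches the commit's observation that shaders whose sha1 starts with a zero byte were never saved.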
Grazvydas Ignotas
f2d4d11611 glsl/shader_cache: restore evicted shader keys
Even though the programs themselves stay in cache and are loaded, the
shader keys can be evicted separately. If that happens, unnecessary
compiles are caused that waste time, and no matter how many times the
program is re-run, performance never recovers to the levels of first
hot cache run. To deal with this, we need to refresh the shader keys
of shaders that were recompiled.

An easy way to currently observe this is running Deus Ex, then piglit
and Deus Ex again, or deleting just the cache index. The latter is
causing over a minute of lost time on all later Deus Ex runs, with this
patch it returns to normal after 1 run.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-27 09:10:16 +11:00
Axel Davy
bdf035ea6f st/nine: Use atomics for available_texture_mem
Resource dtor can be executed in the worker thread.
Use atomic to avoid threading safety issues.

CC: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Tested-by: James Harvey <lothmordor@gmail.com>
2017-03-26 23:10:38 +02:00
Axel Davy
bd85bb51c7 st/nine: Resolve deadlock in surface/volume dtors when using csmt
Surfaces and Volumes can be freed in the worker thread.

Without this patch, pending_uploads_counter could be non-zero
in the Surfaces or Volumes dtor, leading to deadlock.
Instead decrease properly the counter before releasing the
item.

Also avoid another potential deadlock if the item is not
properly unlocked: Do not call UnlockRect which will cause deadlock,
but free directly using the deadlock safe
nine_context_get_pipe_multithread.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99246

CC: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Tested-by: James Harvey <lothmordor@gmail.com>
2017-03-26 23:10:38 +02:00
Axel Davy
31f8b3babb st/nine: Fix user vertex data uploader with csmt
Fix regression caused by
abb1c645c4

The patch made csmt use context.pipe instead of
secondary_pipe, leading to thread safety issues.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
2017-03-26 23:10:38 +02:00
Jose Fonseca
2ba991cbcd scons: Fix dependencies of marshal_generated.[ch].
These generated source files depend not only upon gl_and_es_API.xml, but
all other XML files that are included by it.

This change updates the generation rules to depend on all gen/*.xml
files, like done for other SCons generation rules, and should fix
broken incremental SCons builds due to missing dependencies.

Trivial.
2017-03-26 21:30:34 +01:00
Vinson Lee
641f629536 glsl: Link tests with CLOCK_LIB.
Fix 'make check' linking errors with glibc < 2.17.

  CXXLD  glsl/glsl_test
glsl/.libs/libglsl.a(libmesautil_la-u_queue.o): In function `u_thread_get_time_nano':
src/util/../../src/util/u_thread.h:84: undefined reference to `clock_gettime'

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2017-03-25 01:23:04 -07:00
Timothy Arceri
425671f616 mesa/glthread: add custom marshalling for ClearBufferfv()
This is one of the main causes of syncs in Civ6.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-25 13:39:12 +11:00
Grazvydas Ignotas
b9e92334f7 util/disk_cache: don't deadlock on premature EOF
If we get EOF earlier than expected, the current read loops will
deadlock. This may easily happen if the disk cache gets corrupted.
Fix it by using a helper function that handles EOF.

Steps to reproduce (on a build with asserts disabled):
$ glxgears
$ find ~/.cache/mesa/ -type f -exec truncate -s 0 '{}' \;
$ glxgears # deadlock

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-25 13:08:37 +11:00
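
The kind of EOF-aware read helper that b9e92334f7 describes can be sketched in Python as below. The name `read_all` and its return convention are assumptions made for illustration, not the helper added by the commit. The point is that a loop which only waits for the remaining byte count to reach zero never terminates once read() starts returning 0 bytes on a truncated file.

```python
# Hypothetical illustration, not Mesa code.
import os
from typing import Optional


def read_all(fd: int, count: int) -> Optional[bytes]:
    """Read exactly `count` bytes from fd, or return None on premature EOF."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:
            # EOF before `count` bytes arrived: give up instead of looping.
            return None
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

A caller can then treat a `None` result as a corrupted cache entry and fall back to recompiling.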
Chad Versace
7414326164 genxml: Add 3DSTATE_DEPTH_BUFFER to gen5.xml
isl will use this for validating the depth buffer pitch.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-24 19:07:05 -07:00
Grazvydas Ignotas
7d8ee4b4d0 tests/cache_test: mark arguments const
While at it, also fix up a failure message to not reference timestamp
and gpu dirs as those are no longer being made.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-25 12:46:18 +11:00
Rob Clark
d87ef8f77c freedreno: free compiler when screen is destroyed
Drop ir3_compiler_destroy(), since it is only ralloc_free() and we
shouldn't really have an ir3 dependency in core.  If some future hw
has a new compiler, as long as all its resources are ralloc()d then
things will all just work.

(In practice, I suppose you never really see this leak, but removing
it at least cleans up some noise in valgrind.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-03-24 18:01:47 -04:00
Jason Ekstrand
e6621746dc genxml: Whitespace fixes
Some field names had extra spaces and some had places where we should
have had a space but didn't.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-03-24 15:00:37 -07:00
Jason Ekstrand
34c3f6a27f genxml: Replace "[N]" with "N"
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-03-24 15:00:37 -07:00
Jason Ekstrand
c2af555d6e genxml/gen6: Remove a couple of bogus values
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-03-24 15:00:37 -07:00
Jason Ekstrand
ec27402a8f genxml/gen8: Remove BLACK_LEVEL_CORRECTION_STATE
We've never used it, it only exists on gen8, and the name of the struct
contains piles of bad characters.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-03-24 15:00:37 -07:00
Jason Ekstrand
a6df637d26 genxml: Rename two MCS fields to Auxiliary Surface on gen7
This makes gen7 more consistent with gen8+

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-03-24 15:00:37 -07:00
Rob Clark
c03f6f12bb freedreno: fix memory leak
Otherwise blitter would still hold a ref to, for example, sampler-
views.

To reproduce:

   glmark2 -b desktop:duration=2 --run-forever

Fixes: a8e6734 ("freedreno: support for using generic clear path")
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-03-24 17:49:00 -04:00
Chad Versace
b3f81e06d4 genxml: Fix gen_zipped_file.py dependency
The gen*_xml.h files depend on gen_zipped_file.py, not the gen*_pack.h
files.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-24 14:38:22 -07:00
Chad Versace
c7c6c53adb genxml: Define GENXML_XML_FILES in Makefile.sources
The future header genX_bits.h will depend on GENXML_XML_FILES.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-24 14:38:15 -07:00
Jan Vesely
14b543bdc9 clover: use pipe_resource references
v2: buffers are created with one reference.
v3: add pipe_resource reference to mapping object
v4: rename to pres and drop inline initializers

CC: "17.0 13.0" <mesa-stable@lists.freedesktop.org>

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-03-24 15:57:47 -04:00
Kenneth Graunke
0a60ff4d8c i965: Fix symbolic size of next_offset[] array.
It's indexed by buffer, not stream.  BRW_MAX_SOL_BUFFERS and
MAX_VERTEX_STREAMS happen to both be 4, so there's no actual bug.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-24 12:21:50 -07:00
Kenneth Graunke
652d521408 i965: Remove pointless NULL check from Gen6 primitive counting code.
We create the BO when creating a transform feedback object, and only
destroy it when deleting that object.  So it won't be NULL.

CID: 1401410

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-24 12:21:06 -07:00
Marek Olšák
61926733f9 radeonsi: don't crash on compute shader compile failure
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-24 18:25:05 +01:00
Marek Olšák
518d834162 radeonsi: don't hang on shader compile failure
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-24 18:25:05 +01:00
Nicolai Hähnle
eebd0cd560 radeonsi: fix dvec[34] attributes sourced from current attribute state
The state tracker no longer uploads those attributes for us,
so we must conservatively upload the size of the largest
attribute, which is a dvec4.

Fixes a regression of GL45-CTS.gpu_shader_fp64.varyings and
GL45-CTS.vertex_attrib_64bit.limits_test.

Fixes: 9b91e0b54c ("radeonsi: allow unaligned vertex buffer offsets and strides on CIK-VI")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-24 17:35:21 +01:00
Emil Velikov
15603055fb anv: automake: ensure that the destination directory is created
Earlier commit unintentionally dropped the mkdir, as it was rebased.

Some versions of autotools will not create the output directory for
generated sources. Thus the issue went unnoticed by the original author.

Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Steven Newbury <steve@snewbury.org.uk>
Reported-by: Steven Newbury <steve@snewbury.org.uk>
Fixes: 1610b3dede ("anv: don't pass xmlfile via stdin anv_entrypoints_gen.py")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-24 12:02:04 +00:00
Samuel Pitoiset
43f5a2c915 glsl_to_tgsi: don't rely on glsl types when visiting tex instructions
Instead add is_cube_shadow like is_cube_array.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-24 11:12:27 +01:00
Iago Toral Quiroga
129fd58131 anv/query: handle out of host memory without crashing in compute_query_result()
We don't need to make the caller (CmdCopyQueryPoolResults) aware of the
problem since compute_query_result() only emits state. The caller is also
expected to hit OOM in this scenario right after calling this function, but
it is already handling it safely.

Fixes:
dEQP-VK.api.out_of_host_memory.cmd_copy_query_pool_results

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-24 09:39:44 +01:00
Iago Toral Quiroga
ddb2bb3ed4 anv/pipeline: make FragCoord include sample positions when sample shading
We need to know if sample shading has been requested during shader
compilation since that affects the way fragment coordinates are
computed.

Notice that the semantics of fragment coordinates only depend on
whether sample shading has been requested, not on whether more
than one sample will actually be produced (that is,
minSampleShading and rasterizationSamples do not affect this
behavior).

Because this setting affects the code we generate for the shader, we also
need to include it in the WM prog key. Notice we don't need to alter the
OpenGL code because it doesn't ever use this behavior, so the key's
value is always false (the default).

Fixes:
dEQP-VK.glsl.builtin_var.fragcoord_msaa.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-24 08:11:53 +01:00
Iago Toral Quiroga
023ea3772d nir/lower_wpos_center: support adding sample position to fragment coordinate
According to section 14.6 of the Vulkan specification:

   "When sample shading is enabled, the x and y components of FragCoord
    reflect the location of the sample corresponding to the shader
    invocation."

So add a boolean parameter to the lowering pass to select this behavior
when we need it.
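
A minimal sketch of the choice this parameter makes, written in Python
rather than NIR. The function name and the (0.5, 0.5) pixel-center offset
used when sample shading is off are assumptions for illustration, not the
pass's actual code:

```python
def lower_frag_coord(pixel_xy, sample_pos, for_sample_shading):
    """Return the x/y components of FragCoord for an integer pixel location.

    sample_pos is the sample's offset inside the pixel, each component
    in [0, 1). It is only used when sample shading was requested.
    """
    if for_sample_shading:
        # FragCoord reflects the location of the sample being shaded.
        offset = sample_pos
    else:
        # Otherwise FragCoord points at the pixel center (assumed here).
        offset = (0.5, 0.5)
    return (pixel_xy[0] + offset[0], pixel_xy[1] + offset[1])
```

The flag is what has to be part of the program key: two shaders that
differ only in this flag compute different coordinates.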

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-24 08:11:53 +01:00
Iago Toral Quiroga
4da1832c00 anv: return VK_ERROR_DEVICE_LOST immediately when device is known to be lost
If we know the device has been lost we should return this error code for
any command that can report it before we attempt to do anything with the
device.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-24 08:11:53 +01:00
Iago Toral Quiroga
50c8d2c1f7 anv/device: keep track of 'device lost' state
The Vulkan specs say:

   "A logical device may become lost because of hardware errors, execution
    timeouts, power management events and/or platform-specific events. This
    may cause pending and future command execution to fail and cause hardware
    resources to be corrupted. When this happens, certain commands will
    return VK_ERROR_DEVICE_LOST (see Error Codes for a list of such commands).
    After any such event, the logical device is considered lost. It is not
    possible to reset the logical device to a non-lost state, however the lost
    state is specific to a logical device (VkDevice), and the corresponding
    physical device (VkPhysicalDevice) may be otherwise unaffected. In some
    cases, the physical device may also be lost, and attempting to create a
    new logical device will fail, returning VK_ERROR_DEVICE_LOST."

This means that we need to track if a logical device has been lost so we can
have the commands referenced by the spec return VK_ERROR_DEVICE_LOST
immediately.
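
The tracking described above can be sketched roughly as follows. This is
a Python model, not the driver's C code: the Device class, its fields and
the kernel_submit callback are hypothetical, and only the numeric value of
VK_ERROR_DEVICE_LOST is taken from the Vulkan headers.

```python
VK_SUCCESS = 0
VK_ERROR_DEVICE_LOST = -4


class Device:
    """Hypothetical logical device that remembers whether it was lost."""

    def __init__(self):
        self.lost = False
        self.submissions = 0

    def queue_submit(self, kernel_submit):
        # A lost device never recovers, so fail before touching it.
        if self.lost:
            return VK_ERROR_DEVICE_LOST
        try:
            kernel_submit()
        except OSError:
            # Any submission failure marks the device lost for good.
            self.lost = True
            return VK_ERROR_DEVICE_LOST
        self.submissions += 1
        return VK_SUCCESS
```

Once the flag is set, every later command that can report the error
returns it without doing any work, which is the behavior the spec text
asks for.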

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-24 08:11:53 +01:00
Iago Toral Quiroga
70194c9f1a anv/device: return VK_ERROR_DEVICE_LOST for errors during queue submissions
So that we don't have to do things like rolling back address relocations in
case that we ran into OOM after computing them, etc.

Also, make sure that if the queue submission comes with a fence, we set it up
correctly so it behaves according to the spec after returning
VK_ERROR_DEVICE_LOST.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-24 08:11:53 +01:00
Timothy Arceri
adced4a2f9 mesa/marshal: add custom BufferData/BufferSubData marshalling
GL_AMD_pinned_memory requires memory to be aligned correctly, so
we skip marshalling in this case. Also copying the data defeats
the purpose of EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD.

Fixes GL_AMD_pinned_memory piglit tests when glthread is enabled.

Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-24 11:23:06 +11:00
Timothy Arceri
0a32b52a27 util/disk_cache: write cache entry keys to file header
This can be used to deal with key hash collisions from different
versions (should we find that to actually happen) and to find
which mesa version produced the cache entry.
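
A rough sketch of how a key blob stored in the file header could be used
to reject a colliding entry. The on-disk layout here (a 32-bit length
followed by the blob and then the payload) and both function names are
assumptions for illustration, not the format disk_cache actually writes:

```python
import struct


def pack_entry(driver_keys, payload):
    """Prefix the payload with the length-tagged driver key blob."""
    return struct.pack('<I', len(driver_keys)) + driver_keys + payload


def unpack_entry(driver_keys, raw):
    """Return the payload, or None if the header belongs to other keys."""
    (size,) = struct.unpack_from('<I', raw, 0)
    stored = raw[4:4 + size]
    if stored != driver_keys:
        # Same file name, different producer: treat it as a cache miss.
        return None
    return raw[4 + size:]
```

Reading the stored blob back also tells which version produced the
entry, since the version information is part of the blob.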

V2: use blob created at cache creation.

v3: remove left over var from v1.

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-24 11:20:09 +11:00
Grazvydas Ignotas
5136c09e70 util/disk_cache: hash pointer size and gpu name into cache keys
This allows us to get rid of the arch and gpu name directories.

v2: (Timothy Arceri) don't use an opaque data type to store
    pointer size and gpu name.

v3: (Timothy Arceri) use blob to store driver keys, just make sure
    to store null terminator for strings, and make sure blob is
    defined by disk_cache and not its users.

v4: (Timothy Arceri) fix typo, and make ptr_size a uint8_t.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-24 11:20:09 +11:00
Grazvydas Ignotas
feb716239e util/disk_cache: hash timestamps into the cache keys
Instead of using a directory, hash the timestamps into the cache keys
themselves. Since there is no more timestamp directory, there is no more
need for deleting the cache of other mesa versions and we rely on
eviction to clean up the old cache entries. This solves the problem of
using several incarnations of disk_cache at the same time, where one
deletes a directory belonging to the other, like when both OpenGL and
gallium nine are used simultaneously (or several different mesa
installations).
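
A minimal sketch of the idea, assuming SHA-1 keys and a driver key blob
that holds a null-terminated timestamp string. compute_key is a
hypothetical name, not the disk_cache API:

```python
import hashlib


def compute_key(driver_keys, data):
    """Hash the driver key blob together with the data being cached."""
    h = hashlib.sha1()
    h.update(driver_keys)  # e.g. b"<build timestamp>\0"
    h.update(data)
    return h.hexdigest()
```

Two builds with different timestamps then produce different keys for the
same shader, so their entries can share one directory and neither build
needs to delete the other's files.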

v2: using additional blob instead of trying to clone sha1 state

v3: (Timothy Arceri) don't use an opaque data type to store
    timestamp.

V4: (Timothy Arceri) use blob to store driver keys, just make sure
    to store null terminator for strings, and make sure blob is
    defined by disk_cache and not its users.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100091
2017-03-24 11:20:09 +11:00
Miklós Máté
7ceb1a4fa8 mesa: set thread name for glthread
Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-24 10:00:19 +11:00
Matt Turner
7499bc7fd7 i965: Replace OPT_V() with OPT().
We want to be able to check the progress of each pass and dump the NIR
for debugging purposes if it changed.
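
As an illustration of why each pass needs to report progress, here is a
small Python model of a pass runner. run_opt, remove_nops and the list
based "shader" are hypothetical stand-ins, not the i965 macros:

```python
def run_opt(shader, pass_fn, dump):
    """Run one pass and record the shader only if the pass changed it."""
    progress = pass_fn(shader)
    if progress:
        dump.append((pass_fn.__name__, list(shader)))
    return progress


def remove_nops(shader):
    """Example pass: drop 'nop' instructions and report whether any existed."""
    before = len(shader)
    shader[:] = [ins for ins in shader if ins != 'nop']
    return len(shader) != before
```

A pass that returns nothing gives the runner no way to decide whether a
dump is worth printing, which is what the "Return progress from ..."
commits in this series address.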

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
1be91bd9d8 i965/fs: Return progress from demote_sample_qualifiers().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
fd3351246c i965/fs: Return progress from move_interpolation_to_top().
And mark as static at the same time.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
e0f8daeb86 i965: Return progress from brw_nir_lower_uniforms().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
ef71af7356 nir: Return progress from nir_convert_from_ssa().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
abc8a702d0 nir: Return progress from nir_lower_io().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
a934b00222 nir: Return progress from nir_lower_regs_to_ssa().
And from nir_lower_regs_to_ssa_impl() as well.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
b0e72defc2 nir: Return progress from nir_lower_samplers().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
01548f9f01 nir: Return progress from nir_lower_atomics().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
0bd615d961 nir: Return progress from nir_lower_clamp_color_outputs().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
9dbf91f5c0 nir: Return progress from nir_lower_clip_fs().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
4e4927cd95 nir: Return progress from nir_lower_clip_vs().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
6077cc75aa nir: Return progress from nir_move_vec_src_uses_to_dest().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
a539e05d00 nir: Return progress from nir_lower_to_source_mods().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
5a7e4ae23d nir: Return progress from nir_lower_clip_cull_distance_arrays().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
19345fc160 nir: Return progress from nir_lower_var_copies().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
b831b8d2e1 nir: Return progress from nir_lower_load_const_to_scalar().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
adb157ddfd nir: Return progress from nir_lower_64bit_pack().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
0012a6144a nir: Return progress from nir_lower_doubles().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
c597f87739 nir: Return progress from nir_lower_vars_to_ssa().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
7d41bf8d7b nir: Fix syntax.
et is not an abbreviation.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
70c0455974 nir: Fix misspellings.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
d6e2bdfed3 nir: Stop using apostrophes to pluralize.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Leo Liu
54f9f34181 st/omx/enc: use PIPE_USAGE_STAGING for output buffer
Work around an unknown bug inside transfer_map on certain
ASICs. Also tested with unaffected ASICs, where the performance actually
improved slightly.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-23 14:43:42 -04:00
Daniel Stone
378025ca8b gbm: Use unsigned for BO offset getter
The actual offset returned is uint32_t, however int64_t was used as the
return type from gbm_bo_get_offset to allow negative returns to signal
errors to the caller.

In case of an error getting the offset, the user will also be unable to
get the handle/FD, and thus have nothing to offset into. This means that
returning 0 as an error value is harmless, allowing us to change the
return type to uint32_t in order to avoid signed/unsigned confusion in
callers.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 15:28:41 +00:00
Eric Engestrom
ec0313fd58 REVIEWERS: add autogen.sh to the autoconf group
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-03-23 14:50:51 +00:00
Eric Engestrom
0adc9832f5 docs/submittingpatches: add mention about legal disclaimers
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-23 14:21:48 +00:00
Julien Isorce
48b5f1cca7 r600_shader.c: fix indentation
Introduced by ad13bd2e51

Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-23 13:21:37 +00:00
Topi Pohjolainen
90633079ec glx: Prefer library path given by pkgconfig over the system
A recent change to use drmGetDevices2() made me realize that
a build configured using

PKG_CONFIG_PATH=my_drm_lib_path/pkgconfig ./autogen.sh

considers the libdrm path gotten from pkgconfig only during
make. When invoking "make install" the relink command puts
system library ahead of the path gotten from pkgconfig
(and starts to fail as system libdrm isn't new enough).

This change forces the relink command to respect pkgconfig
settings.

The same problem appears to be discussed in

https://bugs.freedesktop.org/show_bug.cgi?id=100259

where Emil et al. consider it a libtool bug.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
[Emil Velikov: add inline comment]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-23 12:30:19 +00:00
Tapani Pälli
4f69573178 intel: move gen_decoder.* to DECODER_FILES
patch adds DECODER_FILES for libintel_common, this is so that platforms
such as Android not currently using this functionality can opt out.

Fixes: 7d84bb3 ("intel: Move tools/decoder.[ch] to common/gen_decoder.[ch].")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-23 14:05:19 +02:00
Tapani Pälli
bcae4eb502 android: fix vulkan build issues with anv_entrypoints
Patch fixes entrypoint generation for libmesa_anv_entrypoints that
still used old style of calling generator script.

Also small fixes to libmesa_vulkan_common where there was a typo
in target name (vulknan) and files were generated to wrong folder.

Fixes: 8211e3e6 ("anv: Generate anv_entrypoints header and code in one command")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-23 14:04:44 +02:00
Mauro Rossi
0ff8ac1b55 android: i965: generate code for OA counter queries
Automake generation rules are replicated for android.
$* macro was expected to return "hsw" but instead gives "hsw.{h,c}"
so $(basename $*) is used as a workaround
to set the correct --chipset option for brw_oa.py script.

Build tested with nougat-x86

Fixes: e565505 "i965: Add script to gen code for OA counter queries"
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Robert Bragg <robert@sixbynine.org>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-23 08:20:18 +02:00
Tapani Pälli
dc9ebc6ef1 android: rename Intel Vulkan library to match desktop one
The original naming followed the Vulkan HAL naming scheme for no good
purpose, and we need the same binary name for the build-id code.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-23 08:19:16 +02:00
Boyan Ding
51b7fae1ae nouveau: enable glsl/tgsi on-disk cache
v2: Fix argument to nouveau_screen_get_name()

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-22 22:51:35 -04:00
Eric Engestrom
e8875c7a87 REVIEWERS: add myself as a reviewer for EGL and docs
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-23 00:35:45 +00:00
Dylan Baker
4ee675d537 anv: Remove dead prototype from entrypoints
Spotted by Emil.

v2: - Add this patch

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-03-22 16:22:00 -07:00
Dylan Baker
860beb99a6 anv: use cElementTree in anv_entrypoints_gen.py
It's written in C rather than pure python and is strictly faster; the
only reason not to use it is that its classes cannot be subclassed.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-03-22 16:22:00 -07:00
Dylan Baker
9050138af7 anv: don't use Element.get in anv_entrypoints_gen.py
This has the potential to mask errors, since Element.get works like
dict.get, returning None if the element isn't found. I think the reason
that Element.get was used is that vulkan has one extension that isn't
really an extension, and thus is missing the 'protect' field.

This patch changes the behavior slightly by replacing get with explicit
lookup in the Element.attrib dictionary, and using xpath to only iterate
over extensions with a "protect" attribute.
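
The difference can be seen in a small standalone example. The XML below
is a made-up fragment in the shape of the Vulkan registry, not the real
vk.xml:

```python
import xml.etree.ElementTree as et

XML = """<registry>
  <extensions>
    <extension name="VK_KHR_surface"/>
    <extension name="VK_KHR_xcb_surface" protect="VK_USE_PLATFORM_XCB_KHR"/>
  </extensions>
</registry>"""

root = et.fromstring(XML)
first = root.find('./extensions/extension')

# Element.get hides the missing attribute behind None.
missing = first.get('protect')

# An explicit lookup in Element.attrib fails loudly instead.
try:
    first.attrib['protect']
    raised = False
except KeyError:
    raised = True

# The XPath predicate [@protect] skips elements without the attribute,
# so the explicit lookup is safe inside the loop.
guards = [ext.attrib['protect']
          for ext in root.findall('./extensions/extension[@protect]')]
```

With the predicate doing the filtering, a KeyError can only mean the
input is malformed in a way the script did not expect.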

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-03-22 16:22:00 -07:00
Dylan Baker
4d4697f868 anv: use dict.get in anv_entrypoints_gen.py
Instead of using an if and a check, use dict.get, which does the same
thing, but more succinctly.
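
For example, these two hypothetical helpers return the same result for
any dictionary and key:

```python
def guard_with_check(guards, name):
    """Look up a guard with an explicit membership test."""
    if name in guards:
        return guards[name]
    return None


def guard_with_get(guards, name):
    """Same lookup; dict.get returns None when the key is absent."""
    return guards.get(name)
```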

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-03-22 16:22:00 -07:00
Dylan Baker
96a5f2a5ac anv: anv_entrypoints_gen.py: use reduce function.
Reduce is its own reward.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-03-22 16:22:00 -07:00
Dylan Baker
dd3830d11b anv: anv-entrypoints_gen.py: rename hash to cal_hash.
hash is a built-in name in Python; it's the interface to access an
object's hash protocol.
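
A short sketch of a string hash under the new name, folded with reduce.
The multiplier 31 and the 32-bit mask are illustrative assumptions, not
the constants the script uses:

```python
import builtins
from functools import reduce


def cal_hash(name):
    """Fold the characters of name into a 32-bit polynomial hash."""
    return reduce(lambda h, c: (h * 31 + ord(c)) & 0xffffffff, name, 0)


# Because the function is not called "hash", the built-in stays reachable.
builtin_still_visible = hash is builtins.hash
```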

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-03-22 16:22:00 -07:00
Dylan Baker
8211e3e60d anv: Generate anv_entrypoints header and code in one command
This produces the header and the code in one command, saving the need to
call the same script twice, which parses the same XML file.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-03-22 16:22:00 -07:00
Dylan Baker
383032c700 anv: anv_entrypoints_gen.py: directly write files instead of piping
This changes the output to be written as a file rather than being piped.
This has one critical advantage: it encapsulates the encoding. This
prevents bugs where a symbol (generally unicode like © [copyright]) is
printed and the system being built on doesn't have a unicode locale.
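
A minimal sketch of the approach, assuming UTF-8 output. write_generated
is a hypothetical helper, not a function from the script:

```python
def write_generated(path, text):
    """Write generated source to path with an explicit encoding."""
    # Encoding here, instead of printing to stdout, means the result does
    # not depend on the locale of the machine running the build.
    with open(path, 'wb') as f:
        f.write(text.encode('utf-8'))
```

Printing the same text to a pipe would use whatever encoding the
interpreter picked for stdout, which is where a non-unicode locale can
fail on a character such as the copyright sign.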

v2: - Update Android.mk
v3: - Don't generate both files at once
    - Fix Android.mk
    - drop --outdir, since the filename is passed in as an argument

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-03-22 16:22:00 -07:00
Dylan Baker
a2a2bad2e2 anv: convert C generation to template in anv_entrypoints_gen.py
This produces a file that is identical except for whitespace, there is a
table that has 8 columns in the original and is easy to do with prints,
but is ugly using mako, so it doesn't have columns; the data is not
inherently tabular.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-03-22 16:22:00 -07:00
Dylan Baker
0d8e22c5e4 anv: convert header generation in anv_entrypoints_gen.py to mako
This produces an identical file except for whitespace.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-03-22 16:22:00 -07:00
Dylan Baker
ba1085c694 anv: Update "do not edit" comments with proper filename
This does two things, first it updates both the .h and the .c file to
have the same do not edit string. Second, it uses __file__ to ensure
that even if the file is moved or renamed that the name will be correct.

One thing to note is the use of '{{' and '}}' in the C template. This is
to instruct python to print a literal '{' and '}' respectively, rather
than treating the contents as a formatter specifier.
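
The escaping works like this with str.format. The template text and the
render helper are made up for illustration:

```python
TEMPLATE = (
    "/* This file generated from {filename}, don't edit directly. */\n"
    "struct entrypoint {{ int index; }};\n"
)


def render(filename):
    """Fill in the file name; doubled braces come out as single braces."""
    return TEMPLATE.format(filename=filename)


# In a generator script the argument would typically be
# os.path.basename(__file__), so a rename is picked up automatically.
```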

v3: - add this patch

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-03-22 16:22:00 -07:00
Dylan Baker
ed9339bf26 anv: split main into two functions in anv_entrypoints_gen.py
This is groundwork for the next patches; it will allow porting the
header and the code to mako separately, and will also allow both to be
run simultaneously.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-03-22 16:22:00 -07:00
Dylan Baker
1610b3dede anv: don't pass xmlfile via stdin anv_entrypoints_gen.py
It's slow, and has the potential for encoding issues.

v2: - pass xml file location via argument
    - update Android.mk

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-03-22 16:22:00 -07:00
Dylan Baker
8017da8dd2 anv: make constants capitals in anv_entrypoints_gen.py
Again, it's standard python style.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-03-22 16:22:00 -07:00
Dylan Baker
08a6d3b4ba anv: Use python style in anv_entrypoints_gen.py
These are all fairly small cleanups/tweaks that don't really deserve
their own patch.

- Prefer comprehensions to map() and filter(), since they're faster
- replace unused variables with _
- Use 4 spaces of indent
- drop semicolons from the end of lines
- Don't use parens around if conditions
- don't put spaces around brackets
- don't import modules as caps (ET -> et)
- Use docstrings instead of comments

v2: - Replace comprehensions with multiplication
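
Two of the points above, shown on made-up data (the names and the table
size are only examples):

```python
names = ['vkCreateDevice', 'vkDestroyDevice', 'vkQueueSubmit']

# map()/filter() with a lambda versus the equivalent comprehension.
via_map = list(map(len, filter(lambda n: 'Device' in n, names)))
via_comp = [len(n) for n in names if 'Device' in n]

# A list of identical immutable values: multiplication says it directly.
table_comp = [None for _ in range(4)]
table_mult = [None] * 4
```

Multiplication is only a safe replacement when the repeated value is
immutable, as None is here.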

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-03-22 16:22:00 -07:00
Dylan Baker
abd72f2e35 anv: anv_entrypoints_gen.py: use a main function
This is just good practice.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
2017-03-22 16:22:00 -07:00
Alex Smith
bc5d587a80 radv: Invalidate L2 for TRANSFER_WRITE barriers
CP DMA and PKT3_WRITE_DATA (in CmdUpdateBuffer) don't (currently) write
through L2. Therefore, to make these writes visible to later accesses
we must invalidate L2 rather than just writing it back, to avoid the
possibility that stale data is read through L2.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-23 09:20:31 +10:00
Vinson Lee
bb32ea4fc6 glsl: Link glsl_compiler with CLOCK_LIB.
Fix linking error on CentOS 6.

  CXXLD  glsl_compiler
glsl/.libs/libstandalone.a(lt16-libmesautil_la-u_queue.o): In function `u_thread_get_time_nano':
src/util/../../src/util/u_thread.h:84: undefined reference to `clock_gettime'

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-22 14:58:18 -07:00
Timothy Arceri
6a9020f8dc util/disk_cache: use rand_xorshift128plus() to get our random int
Otherwise for apps that don't seed the regular rand() we will always
remove old cache entries from the same dirs.

V2: assume bits returned by rand are independent uniformly distributed
    bits and grab our hex value without taking the modulus of the whole
    value, this also fixes a bug where 'f' was always missing.
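
A Python sketch of the idea. The generator below is the commonly
published xorshift128+ variant with shifts 23/18/5; that the utility
function uses exactly these constants, and that a directory is named by
two hex digits, are assumptions. random_hex_dir is a hypothetical helper.

```python
MASK64 = (1 << 64) - 1


def rand_xorshift128plus(seed):
    """Advance the two-word seed in place and return a 64-bit value."""
    s1 = seed[0]
    s0 = seed[1]
    seed[0] = s0
    s1 ^= (s1 << 23) & MASK64
    seed[1] = s1 ^ s0 ^ (s1 >> 18) ^ (s0 >> 5)
    return (seed[1] + s0) & MASK64


def random_hex_dir(seed):
    """Pick a directory name from the low 8 bits of the random value."""
    r = rand_xorshift128plus(seed)
    # Masking selects bits directly, so every value 0x00..0xff can occur.
    return '%02x' % (r & 0xff)
```

For comparison, reducing a value modulo 15 can never yield the digit
'f', while a 4-bit mask covers all sixteen digits. Whether that is how
the original bug arose is an assumption.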

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-23 08:16:29 +11:00
Timothy Arceri
dd00a3c923 util/rand_xor: add function to seed rand
V2: pass the seed to the seed function so that we can isolate
    its uses. Stop leaking fd when urandom couldn't be read.

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-23 08:16:29 +11:00
Timothy Arceri
53660c2366 util: move rand_xorshift128plus() to utils
V2: pass the seed to rand_xorshift128plus() so that we can isolate
    its uses.

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-03-23 08:16:29 +11:00
Samuel Pitoiset
e11049f2c3 drirc: add force_glsl_abs_sqrt() for "Spec Ops: The Line"
Game ported from D3D9 which expects sqrt() to compute the absolute
value as explained in the spec.

This gets rid of the NaN values as well as the black squares
with RadeonSI.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97338
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-22 22:02:20 +01:00
Samuel Pitoiset
7a0ecbfffd st/glsl_to_tgsi: enable lower_sqrt() conditionally
It relies on the force_glsl_abs_sqrt driconf option.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-22 22:02:20 +01:00
Samuel Pitoiset
737c734cd4 glsl: lower sqrt(abs()) and inversesqrt(abs()) if requested
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-22 22:02:12 +01:00
Samuel Pitoiset
448f4c0c89 driconf: add force_glsl_abs_sqrt option
This allows forcing computation of the absolute value for sqrt()
and inversesqrt() in order to follow D3D9 behaviour for buggy
apps that rely on it.
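
The effect of the option can be modeled like this. sqrt_model stands in
for a GPU sqrt that yields NaN for negative input, which is an assumption
made for illustration; both function names are hypothetical:

```python
import math


def sqrt_model(x):
    """Model of a sqrt that produces NaN for negative input."""
    return math.sqrt(x) if x >= 0.0 else float('nan')


def lowered_sqrt(x, force_glsl_abs_sqrt):
    """Apply abs() to the operand first when the option is enabled."""
    return sqrt_model(abs(x) if force_glsl_abs_sqrt else x)
```

With the option off the result for a negative operand is NaN; with it on
the operand is made non-negative before the square root is taken.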

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-22 22:01:01 +01:00
Tim Rowley
08f864abd9 swr: [rasterizer jitter] fix llvm >= 5.0 build break
Function::getArgumentList() doesn't exist anymore, switch to using
arg_begin() (existed back to at least llvm-3.6.0).

Reviewed-by: Vedran Miletić <vedran@miletic.net>
CC: <mesa-stable@lists.freedesktop.org>
2017-03-22 13:45:35 -05:00
Rob Herring
7a5b5f5226 Android: drop Android 4.4 (KitKat) support
Any users of KitKat are likely using an older version of Mesa and
KitKat support adds complexity to the make files. Dropping support
allows removing the MESA_LOLLIPOP_BUILD make variable in various make
files.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-22 17:53:31 +00:00
Rob Herring
0e1ff22d55 Android: kill off {MESA_}ANDROID_VERSION defines aka Android 4.1 and older
The Android version defines are only needed for versions less than 4.2
which aren't really supported or tested.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-22 17:52:57 +00:00
Rob Herring
6bfad7c659 Android: fix libz dependency for host targets
Commit 6facb0c08f ("android: fix libz dynamic library dependencies")
added libz as a dependency, but this breaks host targets as the host
dependency is libz-host. As no host lib needs libz, just remove the
dependency for them.

Fixes: 6facb0c08f "android: fix libz dynamic library dependencies"
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-22 17:52:35 +00:00
Rob Herring
6f8f97a9b2 Android: remove host libmesa_util
The host libmesa_util is never used for Android builds, so remove it.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-22 17:52:23 +00:00
Rob Herring
5410c60112 Android: clean-up trailing '\' in make variables
Fixed with the following command:

perl -pe 'BEGIN{undef $/;} s/ \\\n\n/\n\n/smg' $(find . -name 'Android.*')

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-22 17:52:06 +00:00
Emil Velikov
50a9b0cb43 mesa/main: remove unused strndup.h include
No longer needed as of commit ac257f1070 ("glsl: calculate
TOP_LEVEL_ARRAY_SIZE and STRIDE when adding resources")

Reported-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-22 17:51:07 +00:00
Emil Velikov
68b545fa27 util: automake: beautify sources list
Remove trailing tabs and sort alphabetically.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:23 +00:00
Emil Velikov
e0129f3142 util/strndup: move header inclusion as applicable
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:23 +00:00
Emil Velikov
e325fc12db util: inline strndup implementation in the header
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:23 +00:00
Emil Velikov
d542d2fc13 util: consistently use ifndef guards over pragma once
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:22 +00:00
Emil Velikov
43a9ca8eb4 mesa/program: consistently use ifndef guards over pragma once
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:22 +00:00
Emil Velikov
f66fe28d9f mesa/main: consistently use ifndef guards over pragma once
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:22 +00:00
Emil Velikov
2438c0a236 intel/compiler: consistently use ifndef guards over pragma once
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:22 +00:00
Emil Velikov
868324419e intel/common: consistently use ifndef guards over pragma once
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:22 +00:00
Emil Velikov
8c8761b237 i965: consistently use ifndef guards over pragma once
The only remaining case is the brw_oa.py generator which pipes the
generated file to stdout. That will be resolved with later commits.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:22 +00:00
Emil Velikov
b04916285e st/wgl: consistently use ifndef guards over pragma once
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:22 +00:00
Emil Velikov
1385e58805 egl/dri2: consistently use ifndef guards over pragma once
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:22 +00:00
Emil Velikov
b27a883205 spirv: consistently use ifndef guards over pragma once
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:22 +00:00
Emil Velikov
e3de145fa2 nir: consistently use ifndef guards over pragma once
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:22 +00:00
Emil Velikov
b08aee305e glsl: consistently use ifndef guards over pragma once
Throughout the glsl headers we had an odd mix of guards: "ifndef",
"pragma once", neither, or both.

Simplify things by using the more common ones (ifndef) and annotating
all the sources, barring the generated builtin header -
builtin_int64.h.

The final header - udivmod64.h - is [seemingly] unused and on its way
out (a patch to purge it is on the mailing list).

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:22 +00:00
Emil Velikov
b0bfb5f89c compiler: consistently use ifndef guards over pragma once
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:21 +00:00
Emil Velikov
b9d035e75b radv: consistently use ifndef guards over pragma once
Namely: annotate the single file which is not using an ifndef guard -
vk_format.h

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:21 +00:00
Emil Velikov
95ab07c586 ac: consistently use ifndef guards over pragma once
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:21 +00:00
Emil Velikov
3b277bae66 i965: make brw_setup_image_uniform_values static
Used only internally.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:21 +00:00
Emil Velikov
7e79e895a6 docs/releasing: do not pass any arguments to autogen.sh
This should just work (tm) with the default options. Plus the one we
pass is already the default, so just drop it.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-03-22 16:55:21 +00:00
Emil Velikov
559ca99ce1 mesa: remove unused linux/version.h include
The header provides the LINUX_VERSION_CODE and KERNEL_VERSION macros,
neither of which is used by any part of mesa.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-22 16:55:21 +00:00
Marek Olšák
84012262ea ac: fix build with LLVM 5.0svn
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-22 17:54:42 +01:00
Marek Olšák
6e2b9fd071 gallivm: remove lp_add_attr_dereferenceable in favor of amd/common
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-22 17:54:40 +01:00
Jason Ekstrand
7ab03ba725 anv/device: Move push descriptor query handling
The query is a properties query so it needs to be handled in
GetPhysicalDeviceProperties2, not GetPhysicalDeviceFeatures2.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-22 09:44:54 -07:00
Jason Ekstrand
c942faf8f3 anv/image: Return early when unbinding an image
Found by inspection.

Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-22 09:44:54 -07:00
Grazvydas Ignotas
10d3702a36 util/sha1: harmonize _mesa_sha1_* wrappers
Rather than using 3 different ways to wrap _mesa_sha1_*() to SHA1*()
functions (a macro, prototype with implementation in .c and an inline
function), make all 3 inline functions.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-22 11:33:51 +00:00
Emil Velikov
64b9a37c3b anv: android: remove unused include/vulkan include
Spotted while skimming through similar hunks for the Autotools build.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-03-22 11:33:40 +00:00
Emil Velikov
1fa6a33e4d anv: automake: use the local headers over any system provided ones
At the moment, we would honour any system headers - vulkan_intel.h in
particular over the ones in-tree.

Thus, if one does an incremental build of mesa without vulkan.h
already installed (or at least not in the same directory as
vulkan_intel.h), the build will fail.

In the future we might want to upstream the vulkan_intel.h within
vulkan.h or use other ways to make vulkan_intel.h obsolete. In either
case, the more robust thing is to rely on our own copy.

v2: Move AM_CPPFLAGS just above LIBDRM_CFLAGS (Grazvydas, Jason)

Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Fixes: ee8044fd "intel/vulkan: Get rid of recursive make"
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-22 11:32:54 +00:00
Nicolai Hähnle
c11dcfb5e9 mesa/main: fix MultiDrawElements[BaseVertex] validation of primcount
primcount must be a GLsizei as in the signature for MultiDrawElements
or bad things can happen.

Furthermore, an error should be flagged when primcount is negative.

Curiously, this code used to work somewhat correctly even when primcount
was negative, because the loop that checks count[i] would iterate out of
bounds and almost certainly hit a negative value at some point.
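The intended order of checks can be sketched as follows. This is only an illustration in Python, not the Mesa code: the function name is hypothetical, and the use of GL_INVALID_VALUE for both negative cases is my reading of the spec language rather than something stated in this message.

```python
GL_NO_ERROR = 0
GL_INVALID_VALUE = 0x0501


def validate_multi_draw_elements(counts, primcount):
    """Return a GL error code for a MultiDrawElements-style call (sketch)."""
    # A negative primcount is rejected up front, before any per-draw
    # data is looked at.
    if primcount < 0:
        return GL_INVALID_VALUE
    # Only the first primcount entries are inspected, so the loop never
    # walks past the entries the caller actually described.
    for i in range(primcount):
        if counts[i] < 0:
            return GL_INVALID_VALUE
    return GL_NO_ERROR
```

Checking primcount first is the point of the fix: the per-draw loop then has a well-defined bound instead of relying on stumbling over a negative value.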

Found by an ASAN error in
GL45-CTS.gtf32.GL3Tests.draw_elements_base_vertex.draw_elements_base_vertex_primcount

Note that the OpenGL spec seems to have s/primcount/drawcount/ at some
point, and the code still reflects the old language.

v2: provide the correct spec quotes (pointed out by Ian)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-03-22 12:12:29 +01:00
Nicolai Hähnle
c2dfff280b mesa: Avoid out-of-bounds stack read via _mesa_Materiali
MATERIALFV may end up reading up to 4 floats from the passed parameter.

This should really set a GL_INVALID_ENUM error in the cases where it
matters, but does anybody really care?

Found by ASAN in piglit gl-1.0-beginend-coverage.

v2: fix a trivial compiler warning

Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
2017-03-22 12:12:11 +01:00
Vinson Lee
bd6f0dcafc configure.ac: Do not strip away space after regex word match.
Fixes: 62c48ccb41 ("configure.ac: Use POSIX compatible regex for word boundary.")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2017-03-22 00:30:22 -07:00
Vinson Lee
62c48ccb41 configure.ac: Use POSIX compatible regex for word boundary.
Fixes build error on Mac OS X.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100236
Suggested-by: Jan Beich <jbeich@freebsd.org>
Suggested-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-21 23:56:17 -07:00
Chad Versace
44ac618a41 isl: Refactor row pitch calculation (v2)
The calculations of row_pitch, the row pitch's alignment, surface size,
and base_alignment were mixed together. This patch moves the calculation
of row_pitch and its alignment to occur before the calculation of
surface_size and base_alignment.

This simplifies a follow-on patch that adds a new member, 'row_pitch',
to struct isl_surf_init_info.

v2:
  - Also extract the row pitch alignment.
  - More helper functions that will later help validate the row pitch.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> (v2)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
2017-03-21 15:56:16 -07:00
Chad Versace
c2b706f8af isl: Drop misplaced comment about padding
isl has a giant comment that explains the hardware's padding
requirements. (Hint: Cache lines and page faults). But the comment is in
the wrong place, in isl_calc_linear_row_pitch(), which is unrelated to
padding.

The important parts of that comment were copied to
isl_apply_surface_padding() long ago. So drop the misplaced comment.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-21 15:56:13 -07:00
Ben Widawsky
0e55e46540 i965/dri: Turn on support for image modifiers
All the plumbing is in place so the extension just needs to be
advertised.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-21 14:48:12 -07:00
Ben Widawsky
cd6bd7f123 i965/dri: Handle X-tiled modifier
This doesn't really "do" anything because the default tiling for the
winsys buffer is X tiled. We do however want the X tiled modifier to
work correctly from the API perspective, which would imply that if you
set this modifier, and later do a get_modifier, you get back at least X
tiled.

Running with a modified kmscube, here are the bandwidth measurements.

Linear:
Read bandwidth: 1039.31 MiB/s
Write bandwidth: 1453.56 MiB/s

Y-tiled:
Read bandwidth: 458.29 MiB/s
Write bandwidth: 542.12 MiB/s

X-tiled:
Read bandwidth: 575.01 MiB/s
Write bandwidth: 606.25 MiB/s

Cc: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-21 14:48:12 -07:00
Ben Widawsky
7ce0405826 i965/dri: Handle Y-tiled modifier
This patch begins introducing how we'll actually handle the potentially
many modifiers coming in from the API, how we'll store them, and the
structure in the code to support it.

Prior to this patch, the Y-tiled modifier would be entirely ignored. It
shouldn't actually be used until this point because we've not bumped the
DRIimage extension version (which is a requirement to use modifiers).

Measuring later in the series with kmscube:
Linear:
Read bandwidth: 1048.44 MiB/s
Write bandwidth: 1483.17 MiB/s

Y-tiled:
Read bandwidth: 471.13 MiB/s
Write bandwidth: 589.10 MiB/s

Similar functionality was introduced and then reverted here:

commit 6a0d036483
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Thu Apr 21 20:14:58 2016 -0700

    i965: Always use Y-tiled buffers on SKL+

v2: Use last set bit instead of first set bit in modifiers to address
bug found by Daniel Stone.

v3: Use the new priority modifier selection thing. This nullifies the
bug fixed by v2 also.

v4: Get rid of modifier compaction which originally served another
purpose and now serves none (Jason)

Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-21 14:48:12 -07:00
Ben Widawsky
d78a36ea62 i965/dri: Handle the linear fb modifier
At image creation create a path for dealing with the linear modifier.
This works exactly like the old usage flags where __DRI_IMAGE_USE_LINEAR
was specified.

During development of this patch series, it was decided that a lack of
modifier was an insufficient way to express the required modifiers. As a
result, 0 was repurposed to mean a modifier for a LINEAR layout.

NOTE: This patch was added for v3 of the patch series.

v2: Rework the algorithm for modifier selection to go from a bitmask
based selection to this priority value.

v3: Make DRM_FORMAT_MOD_INVALID allowed at selection as a way of
identifying no modifiers found (because 0 is LINEAR) (Jason)

v4: Remove the logic to prune unknown modifiers (like those from other
vendors) and simply handle it in select_best_modifier (Jason)
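
A rough sketch of priority-based selection as described in v2-v4, written in Python for brevity. The priority ordering (Y-tiled over X-tiled over linear) and the table layout are assumptions made for illustration; the modifier constants follow the drm_fourcc.h encoding.

```python
DRM_FORMAT_MOD_LINEAR = 0
DRM_FORMAT_MOD_INVALID = 0x00FFFFFFFFFFFFFF
I915_FORMAT_MOD_X_TILED = (1 << 56) | 1
I915_FORMAT_MOD_Y_TILED = (1 << 56) | 2

# Assumed ordering: higher value wins. Priority 0 means "not usable".
MODIFIER_PRIORITY = {
    DRM_FORMAT_MOD_LINEAR: 1,
    I915_FORMAT_MOD_X_TILED: 2,
    I915_FORMAT_MOD_Y_TILED: 3,
}


def select_best_modifier(modifiers):
    """Pick the highest-priority known modifier, or INVALID if none."""
    best = DRM_FORMAT_MOD_INVALID
    best_priority = 0
    for mod in modifiers:
        # Unknown modifiers (e.g. from other vendors) get priority 0 and
        # are skipped here, so no separate pruning pass is needed.
        priority = MODIFIER_PRIORITY.get(mod, 0)
        if priority > best_priority:
            best, best_priority = mod, priority
    return best
```

Returning DRM_FORMAT_MOD_INVALID for "nothing found" matters because 0 is already taken by LINEAR.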

Requested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-21 14:48:12 -07:00
Ben Widawsky
79f619ca70 i965/dri: Enable modifier queries
New to the patch series after reordering things for landing smaller
chunks.

This will essentially enable modifiers from clients that were just
enabled in previous patches. A client could use the modifiers by
setting all of them at create, but had no way to actually query them
after creating the surface (i.e. stupid clients could be broken before
this patch, but in more ways than this).

Obviously, there are no modifiers being actually stored yet - so this
patch shouldn't do anything other than allow the API to get back 0 (or
the LINEAR modifier).

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-21 14:48:12 -07:00
Ben Widawsky
fc1e9f0cb2 i965/dri: Store the screen associated with the image
I will need to get to the devinfo structure, and storing the screen
is an easy way to do that.

It seems to be the consensus that you cannot share an image between
multiple screens.

Scape-goat: Rob Clark <robdclark@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-21 14:48:11 -07:00
Ben Widawsky
2a16de9e4b gbm: Disallow INVALID modifiers returned upon image creation
v2: Add a TODO about modifier validation (Jason)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-21 14:48:11 -07:00
Ben Widawsky
962b31da95 i965/dri: Disallow image with INVALID modifier
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-21 14:48:03 -07:00
Kenneth Graunke
b15038a289 i965: Shut up major()/minor() warnings.
Recent glibc generates this warning:

brw_performance_query.c:1648:13: warning: In the GNU C Library, "minor" is defined
 by <sys/sysmacros.h>. For historical compatibility, it is
 currently defined by <sys/types.h> as well, but we plan to
 remove this soon. To use "minor", include <sys/sysmacros.h>
 directly. If you did not intend to use a system-defined macro
 "minor", you should undefine it after including <sys/types.h>.

    min = minor(sb.st_rdev);

So, include sys/sysmacros.h to shut up the warning.

v2: Use the AC_HEADER_MAJOR defines to figure out the right header
    (thanks to Jonathan Gray for helping me not break non-glibc systems)

Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-21 14:10:17 -07:00
Kenneth Graunke
0c3fbf8028 i965: Drop AUB_TRACE_* stuff.
This was used for aubdumping (deleted a while ago) and INTEL_DEBUG=bat
decoding (deleted recently).

While we're changing parameters, delete the wrapper macro and make the
actual function brw_state_batch instead of __brw_state_batch.

This subsumes a patch by Emil Velikov to drop this from BLORP.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-21 13:49:18 -07:00
Kenneth Graunke
705c38e96f i965: Use aubinator/genxml for INTEL_DEBUG=bat state decoding.
This deletes all of our handwritten code in favor of autogenerated
genxml-based decoding.  This should be much more usable, as the old
code isn't entirely accurate - we updated some things for new
generations, but not everything.

Aubinator has one annoying limitation: it has no idea how many entries
to print when encountering e.g. 3DSTATE_BINDING_TABLE_POINTERS_VS.  It
picks an arbitrary number, which may skip decoding valid data, and may
print extra garbage entries.

We do a better job here by making brw_state_batch track the size of the
data stored at a particular batchbuffer offset.  Then, we can divide by
the structure size to obtain the exact number of entries.
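
The arithmetic is simple enough to sketch in a few lines of Python. The function and parameter names are hypothetical; the real decoder works on the sizes recorded by brw_state_batch.

```python
def entry_count(state_size_bytes, struct_size_bytes):
    """Number of whole structures stored in a state allocation (sketch)."""
    if struct_size_bytes <= 0:
        raise ValueError("structure size must be positive")
    # Integer division: a trailing partial structure is not counted.
    return state_size_bytes // struct_size_bytes
```

For example, a 64-byte allocation holding hypothetical 4-byte entries would decode as 16 entries, rather than an arbitrary guess.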

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-21 13:49:15 -07:00
Kenneth Graunke
5fab46572f i965: Use aubinator/genxml for INTEL_DEBUG=bat commands.
This should give substantially better decoding, as the public libdrm
decoder hasn't been properly maintained in years.

For now, we reuse the existing state dumping mechanism.  We'll improve
that in the next patch.

To avoid increasing the size of the driver, we restrict this feature
to debug builds of Mesa.  There's probably very little use for it in
release builds anyway.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-21 13:49:13 -07:00
Kenneth Graunke
7d84bb32aa intel: Move tools/decoder.[ch] to common/gen_decoder.[ch].
This way they become part of libintel_common.la so I can use them in
the i965 driver.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-21 13:49:10 -07:00
Kenneth Graunke
2b074bb7e5 intel: Add an INTEL_DEBUG=color option.
This will be used for color output in debug messages.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-21 13:48:53 -07:00
Vinson Lee
1fa432741c nir: Add positional argument specifiers.
Fix build with Python < 2.7.

  File "src/compiler/nir/nir_builder_opcodes_h.py", line 46, in <module>
    from nir_opcodes import opcodes
  File "src/compiler/nir/nir_opcodes.py", line 178, in <module>
    unop_convert("{}2{}{}".format(src_t[0], dst_t[0], bit_size),
ValueError: zero length field name in format
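
For reference, a minimal sketch of the fix: "{}" with automatic numbering is only accepted by str.format from Python 2.7 on, while explicit positional indices also work on 2.6. The helper name below is hypothetical and only mirrors the shape of the failing call.

```python
def conversion_name(src_type, dst_type, bit_size):
    """Build a conversion opcode name such as 'f2i32' (illustrative)."""
    # Explicit indices {0} {1} {2} avoid "zero length field name in
    # format" on interpreters that lack automatic field numbering.
    return "{0}2{1}{2}".format(src_type[0], dst_type[0], bit_size)
```

With these inputs, conversion_name("float", "int", 32) yields "f2i32".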

Fixes: 762a6333f2 ("nir: Rework conversion opcodes")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2017-03-21 13:38:00 -07:00
Julien Isorce
ad13bd2e51 r600_shader.c: check returned value of eg_get_interpolator_index
Like done in another place in that same file.

CID 1250588

Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-21 18:14:26 +00:00
Timothy Arceri
020b3f0c46 util/disk_cache: fix build on platforms where shader cache is disabled
2017-03-21 11:51:03 +11:00
Grazvydas Ignotas
b9a370f2b4 util/disk_cache: add a write helper
Simplifies the write code a bit and handles EINTR.

V2: (Timothy Arceri) Drop EINTR handling. To do it
    properly we would need a retry limit but it's
    probably best to just avoid trying to write if
    we hit EINTR and try again next time we see
    the program.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-21 11:51:03 +11:00
Grazvydas Ignotas
af73acca2b tests/cache_test: use the blob key's actual first byte
There is no need to hardcode it, we can just use blob_key[0].
This is needed because the next patches are going to change how cache
keys are computed.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-21 11:15:52 +11:00
Grazvydas Ignotas
529a767041 util/disk_cache: use a helper to compute cache keys
This will allow hashing additional data into the cache keys or even
change the hashing algorithm easily, should we decide to do so.

v2: don't try to compute key (and crash) if cache is disabled

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-21 11:15:52 +11:00
Dave Airlie
021c87fa24 radv: move KHR_get_physical_device_properties2 to instance props.
This is an instance property not a device one.

Fixes:
dEQP-VK.api.info.device.extensions

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-21 10:05:49 +10:00
Dave Airlie
93e62898cc radv: drop illegal DB format error.
We'll get this if we have a stencil only setup.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-21 10:05:49 +10:00
Kenneth Graunke
72c89522c2 i965: Add autogenerated OA files to .gitignore.
2017-03-20 16:28:04 -07:00
Tim Rowley
fe325e6423 swr: [rasterizer] Cleanup naming of codegen files
All template files and generated files are prefixed with gen_.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:05:54 -05:00
Tim Rowley
cf8fa67364 swr: [rasterizer codegen] Remove BOM from knob_defs.py
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:05:54 -05:00
Tim Rowley
8a5069e81f swr: [rasterizer codegen] Rewrite gen_llvm_types.py to use mako
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:05:54 -05:00
Tim Rowley
5d0b3b05a2 swr: [rasterizer codegen] Fix generation of knobs
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:05:54 -05:00
Tim Rowley
4ed72758db swr: [rasterizer codegen] Change backend template comment style
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:05:54 -05:00
Tim Rowley
2776d94545 swr: [rasterizer codegen] Rewrite gen_llvm_ir_macros.py to use mako
Don't create/use cpp files, header only now.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:05:54 -05:00
Tim Rowley
9538ba9bd1 swr: [rasterizer codegen] Quiet gen_backends.py execution
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:05:54 -05:00
Tim Rowley
97cbabc8fb swr: [rasterizer scripts] Put codegen scripts into a separate directory
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:05:54 -05:00
Tim Rowley
7046695a0e swr: [rasterizer core] Fix trifan regression from 9d3442575f
Fixes piglit triangle-rasterization-overdraw.

SIMD16 path not working.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:05:22 -05:00
Tim Rowley
4cb69e817c swr: [rasterizer core] SIMD16 Frontend WIP - fix tessellation crashes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
ab3f4449c3 swr: [rasterizer jitter] Fix LogicOp blend jit after assert changes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
8cd8240cfc swr: [rasterizer] Convert more SWR_ASSERT(false, ...) to SWR_INVALID(...)
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
ab032fb436 swr: [rasterizer core] Fix typo in SIMD16 code path
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
d011ba74ee swr: [rasterizer core/common] Fix the native AVX512 build under ICC
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
2f513d8d83 swr: [rasterizer core] Allow no arguments to SWR_INVALID macro
Turns out this is somewhat tricky with gcc/g++.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
0b066b2bf3 swr: [rasterizer] Slight assert refactoring
Make asserts more robust.

Add SWR_INVALID(...) as a replacement for SWR_ASSERT(0, ...)

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
f445b6de9c swr: [rasterizer] Backend code adjustments
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
e4d1294afb swr: [rasterizer archrast] Fix the early and late depthstencil events
The coverage and stencil mask arguments were reversed.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
a508c2c2ac swr: [rasterizer core] Implement double pumped SIMD16 TESS
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
2cbac00221 swr: [rasterizer archrast/core/scripts] Fix archrast multithreading issue
Per pixel stats are cached but were not always being flushed as threads
moved from one draw context to the next.  Added an explicit flush to allow
all archrast objects to flush any cached events.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
0a36a7cf04 swr: [rasterizer archrast] Remove redundant data from archrast files
If a count can be derived from other counts, then this can be done in
post-processing scripts.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
1cc885d1d1 swr: [rasterizer archrast/scripts] Further archrast cleanups
Removed redundant data being written out to file

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
1399fbd6fd swr: [rasterizer core] Fix RECT_LIST primitive assembly
The bug would make the 3rd component of attributes on the second
triangle of a RECT be invalid.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
ade5351900 swr: [rasterizer common] Add InterpolateComponentFlat utility
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
ab04221bf1 swr: [rasterizer archrast] Fix performance issue with archrast stats
Performance with archrast is now 50x faster, since we're properly
filtering out all of the rdtsc begin/end.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
b228d2db18 swr: [rasterizer core] Implement SIMD16 GS and STREAMOUT
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
5830a0a6f8 swr: [rasterizer archrast] Add additional API events
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
d2759c1eb3 swr: [rasterizer core/scripts] Autogen backend initialization function(s)
Autogen functions that instantiate different BackendPixelRate templates.
Functions get split into separate files after reaching a user defined
threshold (currently 512 per file) to speed up compilation.

This change will enable the addition of more template flags in the pixel
back end.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
2c820d22cf swr: [rasterizer core] backend.h declares gBackendPixelRateTable
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
50d491e22d swr: [rasterizer core] Finish SIMD16 PA OPT including tessellation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
9d3442575f swr: [rasterizer core] Finish SIMD16 PA OPT except tessellation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Tim Rowley
7b94e5e1fa swr: [rasterizer core] Support sparse numa id values on all OSes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-20 18:04:53 -05:00
Kenneth Graunke
5e29af5f77 i965: Skip register write detection when possible.
Detecting register write support by trial and error introduces a
stall at screen creation time, which it would be nice to avoid.
Certain command parser versions guarantee this will work (see the
giant comment in intelInitScreen2 below, or a few commits ago):

- Ivybridge: version >= 1 (kernel v3.16)
- Baytrail:  version >= 2 (kernel v3.19)
- Haswell:   version >= 7 (kernel v4.8)

For simplicity, we don't bother with version 1 in this patch.

This assumes that the user hasn't disabled aliasing PPGTT via a kernel
command line parameter.  Don't do that - you're only breaking things.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-20 15:58:05 -07:00
Kenneth Graunke
31693a13f8 i965: Set screen->cmd_parser_version to 0 if we can't write registers.
If we can't write registers, then the effective command parser version
is 0 - it may exist, but it's not usefully enabling anything.

See kernel commit 1ca3712ca3429a617ed6c5f87718e4f6fe4ae0c6 (in v4.8)
where the kernel starts doing this for us.  This makes us do more or
less the same thing on older kernels.

This should preserve a bit of sanity by allowing us to perform a
screen->cmd_parser_version >= N check to determine that we really can
use the features promised by command parser version N.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-20 15:58:05 -07:00
Kenneth Graunke
4a2ad6b145 i965: Document the sad story of the kernel command parser.
This should help us figure out the complexities of which kernel
versions we need to get various features on various platforms.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-20 15:58:05 -07:00
Kenneth Graunke
9b324e4dca i965: Fall back to GL 4.2/4.3 on Haswell if the kernel isn't new enough.
In commit d2590eb65f I enabled GL 4.5
on Haswell...but failed to check if we could do indirect compute
shader dispatch...and query buffer objects.

Indirect compute shader dispatch requires command parser version 5
(kernel commit 7b9748cb513a6bef4af87b79f0da3ff7e8b56cd8, which is in
Linux v4.4).  On earlier kernels we would have disabled
ARB_compute_shader, which is a mandatory part of OpenGL 4.3+.

Query buffer objects currently require MI_MATH and MI_LOAD_REGISTER_REG,
which mean command parser version 7 (Linux v4.8).  On earlier kernels
we would have disabled ARB_query_buffer_object, which is a mandatory
part of OpenGL 4.4+.

The new version support looks like:

- Kernel 4.1 and older => OpenGL 3.3
- Kernel 4.2-4.3       => OpenGL 4.2
- Kernel 4.4-4.7       => OpenGL 4.3
- Kernel 4.8+          => OpenGL 4.5
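
The table above can be read as a simple lookup. The Python sketch below keys on the kernel version purely for illustration; the function name is hypothetical, and the driver itself decides based on the command parser version reported by the kernel, not on the kernel release number.

```python
def haswell_max_gl_version(kernel_major, kernel_minor):
    """Map a kernel release to the GL version from the table (sketch)."""
    kernel = (kernel_major, kernel_minor)
    if kernel >= (4, 8):
        # MI_MATH / MI_LOAD_REGISTER_REG allowed: query buffer objects work.
        return (4, 5)
    if kernel >= (4, 4):
        # Indirect compute dispatch allowed: ARB_compute_shader works.
        return (4, 3)
    if kernel >= (4, 2):
        return (4, 2)
    return (3, 3)
```

Tuple comparison keeps the ranges contiguous, so 4.7 still maps to OpenGL 4.3 and anything from 4.8 on maps to OpenGL 4.5.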

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-20 15:58:05 -07:00
Constantine Kharlamov
99d400b78f r600g/sb: Fix memory leak by reworking uses list (rebased)
The author is Heiko Przybyl (CC'ing); the patch is rebased on top of Bartosz Tomczyk's one per Dieter Nützel's comment.
Tested-by: Constantine Charlamov <Hi-Angel@yandex.ru>

v2: Resend the patch again through git-email. The prev. rebase was sent
through Thunderbird, which screwed up tab characters, making the patch
not apply.

--------------
When fixing the stalls on evergreen I introduced leaking of the useinfo
structure(s). Sorry. Instead of allocating a new object to hold 3 values
where only one is actually used, rework the list to just store the node
pointer. Thus no allocation or deallocation is needed. Since use_info
and use_kind aren't used anywhere, drop them and reduce code complexity.
This might also save some small amount of cycles.

Thanks to Bartosz Tomczyk for finding the bug.

Reported-by: Bartosz Tomczyk <bartosz.tomczyk86@gmail.com>
Signed-off-by: Heiko Przybyl <lil_tux@web.de>
Supersedes: https://patchwork.freedesktop.org/patch/135852
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-03-20 23:23:50 +01:00
Marek Olšák
827ae79b2c radeonsi: check the IR type before waiting for a compute compilation fence
This should fix OpenCL getting stuck.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100288
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-03-20 23:17:14 +01:00
Kenneth Graunke
4084083124 aubinator: Move the guts of decode_group() to decoder.c.
This lets us use it outside of the aubinator binary itself.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-20 11:20:51 -07:00
Kenneth Graunke
aa1ef0b984 aubinator: Drop spec parameter to decode_group().
No longer necessary - the iterator gets it from the group.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-20 11:20:51 -07:00
Kenneth Graunke
b2c0c1d9a5 aubinator: Make the iterator store a pointer to structure descriptions.
When the iterator encounters a structure field, it now looks up the
gen_group for that structure definition and saves a pointer to it.

This lets us drop a lot of ridiculous code in the caller, which looked
at item->value (<struct NAME dword>), strtok'd the structure name back
out, and looked it up itself.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-20 11:20:51 -07:00
Kenneth Graunke
a1aa78cb45 aubinator: Track the current field's starting dword offset.
The iterator code already computed this value, then we stored it in
the structure name, strtok'd it back out, and also manually computed
it when printing dword headers.

Just put the value in the struct and use it.  Way simpler.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-20 11:20:51 -07:00
Kenneth Graunke
e6f7357cab aubinator: Drop decode_structure() helper.
It made more sense when decode_group() took a bunch of extra options,
but now that there's only one...we may as well pass 0 and call it a day.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-20 11:20:51 -07:00
Kenneth Graunke
a8d4184b00 aubinator: Drop unused print_dword_headers flag.
I added this flag in 65a9d5eabb but
it was completely unused.  Both callers appear to have printed dword
headers, so we can just drop the flag and continue doing it
unconditionally.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-20 11:20:51 -07:00
Kenneth Graunke
7f21cb56b8 aubinator: Store a pointer from gen_group back to gen_spec.
When decoding a structure field within a group, we may want to look up
that structure type.  Having a gen_spec pointer makes it easy to do so.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-20 11:20:51 -07:00
Kenneth Graunke
2c6c760a4b aubinator: Store enum textual name in iter->value.
gen_field_iterator_next() produces a string representing the value of
the field.  For enum values, it also produced a separate "description"
string containing the textual name of the enum.

The only caller of this function combines the two, printing enums as
"<numeric value> (<textual enum name>)".  We may as well just store
that in item->value directly, eliminating the description field, and
a layer of wrapping.

v2: Use non-overlapping source and destination strings in snprintf.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-20 11:20:51 -07:00
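The combined "<numeric value> (<name>)" string from 2c6c760a4b, together with the v2 note about non-overlapping snprintf arguments, can be sketched as below. This is an illustration under assumptions, not the aubinator code: `format_enum_value` is a hypothetical helper, and the string is built straight from the number and the name so that the destination buffer is never also passed as a source argument (which snprintf does not allow).

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes "<value> (<enum_name>)" into dst.
 * dst must not overlap enum_name; snprintf has undefined behaviour
 * when source and destination overlap.
 * Returns the length snprintf reports (excluding the terminator). */
static int format_enum_value(char *dst, size_t dst_size,
                             uint64_t value, const char *enum_name)
{
    assert(dst != NULL && dst_size > 0 && enum_name != NULL);
    return snprintf(dst, dst_size, "%" PRIu64 " (%s)", value, enum_name);
}
```

A caller would pass a scratch buffer distinct from the enum name, for example `format_enum_value(buf, sizeof(buf), 3, "TRILIST")`.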
Julien Isorce
a6e2124402 si_descriptor: move velems nullity check before dereference
CID 1399479: Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking velems suggests that it may be null,
but it has already been dereferenced on all paths leading to the check.

Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-20 18:01:51 +00:00
Julien Isorce
521860b2a9 radeon_drm_bo: explicitly check return value of drmCommandWriteRead
CID 1313492

Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-20 18:01:51 +00:00
Julien Isorce
dac124466a si_pipe: remove nullity check after dereference
sscreen cannot be NULL

CID 1354483

Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-20 18:01:41 +00:00
Julien Isorce
ce27b27c38 radeon: initialize hole variable before calling container_of
Like in a few other places in that radeon_drm_bo.c file.

CID 715739.

Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-20 16:47:31 +00:00
Nanley Chery
7c50f9903f intel: Correct the BDW surface state size
The PRMs state that this packet is 16 DWORDS long. Ensure that the last
three DWORDS are zeroed as required by the hardware when allocating a
null surface state.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-03-20 09:43:44 -07:00
Bartosz Tomczyk
f4b23589da r600g: Fix out of bounds access
The fc_sp variable should indicate the number of elements in the
fc_stack array, but fc_sp was incremented at the beginning of the
fc_pushlevel function. As a result, idx=0 was never used and the last
(32nd) element was stored outside the fc_stack array.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-03-20 17:32:53 +01:00
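The off-by-one described in f4b23589da can be illustrated with a small stack. This is a sketch under assumptions rather than the r600g code: `struct fc_stack`, `fc_push` and `fc_pop` are hypothetical names, and the capacity of 32 is taken from the message. Incrementing the stack pointer before the store would skip index 0 and write one slot past the end; storing first and incrementing afterwards keeps the pointer equal to the element count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FC_STACK_SIZE 32u   /* assumed capacity */

struct fc_stack {
    int entries[FC_STACK_SIZE];
    unsigned sp;            /* number of valid entries */
};

/* Store at entries[sp], then increment. The buggy order (increment,
 * then store) leaves entries[0] unused and writes entries[32]. */
static bool fc_push(struct fc_stack *s, int value)
{
    assert(s != NULL);
    if (s->sp >= FC_STACK_SIZE)
        return false;       /* stack full, nothing written */
    s->entries[s->sp++] = value;
    return true;
}

/* Decrement first, then read, mirroring fc_push. */
static bool fc_pop(struct fc_stack *s, int *value)
{
    assert(s != NULL && value != NULL);
    if (s->sp == 0)
        return false;       /* stack empty */
    *value = s->entries[--s->sp];
    return true;
}
```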
Constantine Kharlamov
f9190f3e65 r600g: update sb documentation
v2: s/r600/r600g in the title

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-03-20 17:11:15 +01:00
Constantine Kharlamov
64cbbd2888 r600g: make condition clearer
The second check in the old code looked pretty much unreachable, esp.
because it's not obvious that "max_entries" could be zero. To find out
that it was intentional I had to run some checks, and to dig into
the old versions of the file.

So, rewrite the check to make the intention clear.

v2: s/r600/r600g in the title, and per Dieter Nützel's comment wrap
lines of condition.

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-03-20 17:11:15 +01:00
Emil Velikov
36e029d356 docs: add news item and link release notes for 13.0.6/17.0.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-20 14:25:18 +00:00
Emil Velikov
54fd78f637 docs: add sha256 checksums for 17.0.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 9b66351f5b)
2017-03-20 14:20:32 +00:00
Emil Velikov
887ad468b5 docs: add release notes for 17.0.2
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 373d88a711)
2017-03-20 14:20:31 +00:00
Emil Velikov
9bad99742f docs: add sha256 checksums for 13.0.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 879d24c497)
2017-03-20 14:20:26 +00:00
Emil Velikov
0babb9e091 docs: add release notes for 13.0.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit fcef88d13a)
2017-03-20 14:20:25 +00:00
Xu,Randy
57595cb073 anv/genX: Solve the vkCreateGraphicsPipelines crash
The crash is due to NULL pColorBlendState, which is legal if the
pipeline has rasterization disabled or if the subpass of the render pass
the pipeline is created against does not use any color attachments.

Test: Sample subpasses from LunarG can run without crash

Signed-off-by: Xu,Randy <randy.xu@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-20 08:31:18 +02:00
Dave Airlie
e70e7cc7ff radv: fix logic for when to flush on multiple CS emission
The current code always evaluated to true, but we only want to flush
on the first submit. Rename the variable to do_flush, and only
emit on the first iteration.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-20 14:17:43 +10:00
Jason Ekstrand
fcca6a83cd spirv: Implement IsInf using an integer comparison
Since we already do fabs on the one source, we're guaranteed to get
positive infinity if we get any infinity at all.  Since +inf only has
one IEEE 754 representation, we can use an integer comparison and avoid
all of the ordered/unordered issues.

Cc: Dave Airlie <airlied@redhat.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-20 14:08:19 +10:00
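The integer comparison described in fcca6a83cd can be sketched in plain C. This is only an illustration under assumptions, not the SPIR-V/NIR code from the commit: `is_inf_by_bits` and `F32_POS_INF_BITS` are hypothetical names, and the sign bit is cleared with a mask, which has the same effect on the bit pattern as fabs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Bit pattern of +infinity for an IEEE 754 binary32 value. */
#define F32_POS_INF_BITS UINT32_C(0x7f800000)

/* Hypothetical helper: true if x is +inf or -inf, false otherwise. */
static bool is_inf_by_bits(float x)
{
    uint32_t bits;

    assert(sizeof(bits) == sizeof(x));
    memcpy(&bits, &x, sizeof(bits));   /* reinterpret the float's bits */
    bits &= UINT32_C(0x7fffffff);      /* clear the sign bit, as fabs would */

    /* NaNs share the all-ones exponent but have a nonzero mantissa, so
     * they never compare equal; no ordered/unordered handling is needed. */
    return bits == F32_POS_INF_BITS;
}
```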
Dave Airlie
e0208949d1 radv/meta: fix image clears for r4g4 format.
This just uses an 8-bit clear and packs the values.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-20 13:41:31 +10:00
Dave Airlie
10c2b588c4 Revert "radv: fallback to an in-memory cache when no pipline cache is provided"
This reverts commit 2845a108a9.

This breaks VK-GL-CTS randomly.
./deqp-vk --deqp-case=dEQP-VK.texture.filtering.3d.formats.r4g4b4a4*

bounces around here from 6/6 to 3/6 or 4/6 to hanging.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-20 13:41:31 +10:00
Timothy Arceri
72fa447d45 mesa: disable glthread when glNewList() is called
glNewList() swaps dispatch tables, and we don't have anything in
place to handle that in glthread.

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2017-03-20 10:22:20 +11:00
Dave Airlie
d06e168b87 radv: fix primitive reset index emission
This was meant to be checking the index type to get the correct
index not the last emitted one. This fixes:
dEQP-VK.pipeline.input_assembly.primitive_restart.index_type_uint32.triangle_strip_with_adjacency

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-20 08:47:03 +10:00
Grazvydas Ignotas
274aaa331c util/disk_cache: check rename result
I haven't seen this causing problems in practice, but for correctness
we should also check if rename succeeded to avoid breaking accounting
and leaving a .tmp file behind.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-20 08:24:46 +11:00
Grazvydas Ignotas
67911fa4b8 util/disk_cache: delete .tmp if target exists
At the time of target file check, .tmp file is already created and file
lock is held, so we should remove the .tmp, like in other error paths.

With this, piglit no longer leaves a large number of empty .tmp files
behind, which waste directory entries and may interfere with eviction.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-20 08:24:38 +11:00
Grazvydas Ignotas
bd93cea691 util/disk_cache: fix stored_keys index
It seems there is a bug because:
- 20 bytes are compared, but only 1 byte stored_keys step is used
- entries can overlap each other by 19 bytes
- index_mmap is ~1.3M in size, but only first 64K is used

With this fix for Deus Ex:
- startup time (from launch to Feral logo): ~38s -> ~16s
- disk_cache_has_key() hit rate: ~50% -> ~96%

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-20 08:14:31 +11:00
Ilia Mirkin
663e7c25f5 nv30: create uploader after pipe->screen is set
Fixes crashes after recent upload rework.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-19 01:24:06 -04:00
Ilia Mirkin
0e9232dbcc nv50,nvc0: enable TEX_LZ and TXF_LZ
There should be minimal gain, if any, for nvc0, but nv50 may end up
noticing more often that the lod argument is uniform. This, in turn,
will remove the need for some unnecessary transformations, which were
being hit due to the checks being done pre-ssa.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-18 20:37:52 -04:00
Ilia Mirkin
dab88e9af7 st/mesa: set result writemask based on ir type
This prevents textureQueryLevels, which maps as LODQ, from ending up
with a xyzw writemask, which is illegal.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100061
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-18 20:16:45 -04:00
Karol Herbst
09f16de7e6 nvc0/ir: treat FMA like MAD for operand propagation
Helps mainly Feral-ported games, due to their use of fma()

shader-db changes:
total instructions in shared programs : 3901147 -> 3842505 (-1.50%)
total gprs used in shared programs    : 471258 -> 467359 (-0.83%)
total local used in shared programs   : 27405 -> 27361 (-0.16%)
total bytes used in shared programs   : 35749888 -> 35214176 (-1.50%)

                local        gpr       inst      bytes
    helped          17        1829        4091        4091
      hurt           4          44           3           3

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-03-18 20:15:45 -04:00
Alan Swanson
a7eb7984bf util/disk_cache: pass predicate functions file stats directly (v4)
Since switching to LRU eviction the only user of these predicate
functions now resolves directory entry stats itself so pass them
directly saving calling fstat and strlen twice (and the
expensive strlen is skipped entirely if access time is newer).

v2: Update for empty cache dir detection changes
v3: Fix passing string length to predicate with the +1 for NULL
    termination and also pass sb as pointer
v4: Missed ampersand for passing sb as pointer

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-18 14:32:57 +11:00
Timothy Arceri
bf8bc6190e glsl: use set for copy propagation kills
Previously each time we saw a variable we just created a duplicate
entry in the list. This is particularly bad for loops, where we add
everything twice; throw nested loops into the mix and the
list grows exponentially.

This stops the glsl-vs-unroll-explosion test, which has 16 nested
loops, from reaching the test's memory usage limit in this pass. The
test now hits the memory limit in opt_copy_propagation_elements()
instead.

I suspect this was also part of the reason this pass can be so
slow with some shaders.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-03-18 14:21:09 +11:00
Timothy Arceri
9e42b93f33 st/dri: wait for thread to finish before unbinding context
Fixes a bunch of piglit crashes that hit an assert() when trying
to delete the framebuffer. The assert() was triggered because
WinSysDrawBuffer was set to NULL before glDeleteFramebuffers()
was called.

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-18 14:15:52 +11:00
Timothy Arceri
40bc1afc94 glsl: don't leak memory when trying to count loop iterations
Suggested-by: Damian Dixon <damian.dixon@gmail.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99789
2017-03-18 14:12:40 +11:00
Jason Ekstrand
1d5f4f46da genxml: Make MI_STORE_DATA_IMM have a single 64-bit data field
This is way more convenient than having two separate dword fields.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 15:31:19 -07:00
Jason Ekstrand
ced61fd53e anv: Turn on inherited queries
It all just works since it's just a hardware register so we might as
well turn it on.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Ilia Mirkin
e675f57d4f anv: Implement pipeline statistics queries
In the end, pipeline statistics queries look a lot like occlusion
queries only with between 1 and 11 begin/end pairs being generated
instead of just the one.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand
dda54890f3 anv: Disable VF statistics for blorp and SOL memcpy
In order to get accurate statistics, we need to disable statistics for
blits, clears, and the surface state memcpy at the top of each secondary
command buffer.  There are two possible approaches to this:

 1) Disable before the blit/memcpy and re-enable afterwards

 2) Move emitting 3DSTATE_VF_STATISTICS from initialization and make it
    part of pipeline state and then just disable statistics before
    blits and memcpy operations.

Emitting 3DSTATE_VF_STATISTICS should be fairly cheap so it doesn't
really matter which path we take.  We choose the second option as it's
more consistent with the way the rest of the statistics are enabled and
disabled.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand
9576cea519 anv/pipeline: Enable clipper statistics
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand
2a616242cd genxml: s/Clipper Statistics Enable/Statistics Enable/
It's in 3DSTATE_CLIP, so it doesn't really need the extra detail.  This
matches what we do for VS, FS, etc.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand
149d10d38a anv/query: Rework store_query_result
The new version is a nice GPU parallel to cpu_write_query_result and it
nicely handles things like dealing with 32 vs. 64-bit offsets in the
destination buffer.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand
c773ae88df anv/query: Break GPU query calculation into a helper
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand
7de73f0c94 genxml: Add pipeline statistics registers on gen7+
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand
0557dfdb4a anv/query: Add a helper for writing a query pool result
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand
bce4a935c6 anv/query: Use a variable-length slot size
Not all queries are the same.  Even the two queries we support today
require a different amount of data per slot.  Once we introduce pipeline
statistics queries, the size will vary wildly.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:49 -07:00
Jason Ekstrand
1c797af2c6 anv/query: Move the available bits to the front
We're about to make slots variable-length and always having the
available bits at the front makes certain operations substantially
easier once we do that.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:47 -07:00
Jason Ekstrand
9d43afa3dc anv/query: Let 32-bit values wrap
From the Vulkan 1.0.39 Specification:

   "If VK_QUERY_RESULT_64_BIT is not set and the result overflows a
   32-bit value, the value may either wrap or saturate."

So we can either clamp or wrap.  Wrapping is both easier and what the
user gets if they use vkCmdCopyQueryPoolResults and we should be
consistent.  We could make vkCmdCopyQueryPoolResults clamp but it's
annoying and ends up burning extra batch for something the spec clearly
doesn't require.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:11:35 -07:00
Alex Deucher
c2a97fb7ae radeonsi: add new polaris12 pci id
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-17 14:13:17 -04:00
Marek Olšák
4b064d16e5 gallium/radeon: formalize that create_batch_query doesn't need pipe_context
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 18:30:21 +01:00
Marek Olšák
be6173e7d6 gallium/radeon: formalize that create_query doesn't need pipe_context
for threaded gallium

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 18:30:21 +01:00
Marek Olšák
04e6977e5d gallium/radeon: reference pipe_resource in pipe_transfer
for threaded gallium

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 18:30:21 +01:00
Marek Olšák
03127bb6d5 radeonsi: compile all TGSI compute shaders asynchronously
required by threaded gallium

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 18:30:21 +01:00
Marek Olšák
e9c6953ddb radeonsi: require that compiler threads are enabled
threaded gallium can't use pipe_context's LLVM target machine, because
create_shader_selector can be called from a non-driver thread.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 18:30:21 +01:00
Marek Olšák
080f322f06 trace: remove leftover assertions after pipe_resource wrapping removal
2017-03-17 18:30:21 +01:00
Marek Olšák
6c0a28084d gallium/u_upload: make the first persistent mapping unsynchronized
This is simpler for drivers.
2017-03-17 18:30:21 +01:00
Robert Bragg
a27b62e794 anv/device: init timestampPeriod from devinfo
Now that there's a timebase_scale in gen_device_info which is
effectively the 'period' this switches anv_GetPhysicalDeviceProperties
to using this common device info to initialize the timestampPeriod
device limit.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-17 16:10:22 +00:00
Robert Bragg
344d1a4015 i965: Allow a per gen timebase scale factor
Prior to Skylake the Gen HW timestamps were driven by a 12.5MHz clock
with the convenient property of being able to scale by an integer (80)
to nanosecond units.

For Skylake the frequency is 12MHz, or a scale factor of 83.333333.

This updates gen_device_info to track a floating point timebase_scale
factor and makes corresponding _queryobj.c changes to no longer assume a
scale factor of 80 works across all gens.

Although the gen6_ code could have been left alone, the changes
keep the code more comparable, and it now shares a few utility functions
for scaling raw timestamps and calculating deltas. The utility for
calculating deltas takes into account 32 or 36bit overflow depending on
the current kernel version.

Note: this leaves the timestamp handling of ARB_query_buffer_object
untouched, which continues to use an incorrect scale of 80 on Skylake
for now. This is more awkward to solve since the scaling is currently
done using a very limited uint64 ALU available to the command parser
that doesn't support multiply or divide, where it already takes a
large number of instructions just to effectively multiply by 80.

This fixes piglit arb_timer_query-timestamp-get on Skylake

v2: (Ken) Update timebase_scale for platforms past Skylake/Broxton too.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-17 15:45:19 +00:00
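The arithmetic behind 344d1a4015 is short enough to write out. A 12.5MHz clock ticks every 1e9 / 12.5e6 = 80 ns, while a 12MHz clock ticks every 1e9 / 12e6 = 83.333... ns, so an integer factor no longer works. The helpers below are hypothetical (`timebase_scale_for_hz`, `ticks_to_ns`, `raw_delta`) and only sketch the idea of a floating point scale and an overflow-aware delta; they are not the i965 utility functions.

```c
#include <assert.h>
#include <stdint.h>

/* Nanoseconds per timestamp tick for a given clock frequency. */
static double timebase_scale_for_hz(double hz)
{
    assert(hz > 0.0);
    return 1e9 / hz;
}

/* Scale a raw timestamp to nanoseconds. */
static uint64_t ticks_to_ns(uint64_t ticks, double timebase_scale)
{
    return (uint64_t)((double)ticks * timebase_scale);
}

/* Delta between two raw timestamps from a counter that wraps after
 * `bits` bits (32 or 36 in the message). */
static uint64_t raw_delta(uint64_t t0, uint64_t t1, unsigned bits)
{
    assert(bits > 0 && bits < 64);
    uint64_t mask = (UINT64_C(1) << bits) - 1;
    return (t1 - t0) & mask;
}
```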
Jason Ekstrand
28b134c75c anv/device: Remove a use of a compound literal
Older versions of GCC don't like compound literals in static const
variable declarations because they don't think it's an actual constant
value.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 08:40:30 -07:00
Robert Bragg
76dc49f3fb i965: bounds checks while concatenating sysfs paths
This adds some missing return value checks for all uses of snprintf in
brw_performance_query.c. This also switches a use of strncpy + strncat
for snprintf for consistency and to avoid the chance of the strncpy
leaving an unterminated string in the dest buffer if the src is too
long.

This issue with strncpy was picked up by Coverity.

CID: 1402201
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 13:40:29 +00:00
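The kind of return value check 76dc49f3fb describes looks roughly like this. `join_path` is a hypothetical helper, not a function from brw_performance_query.c. snprintf always terminates the destination (for a non-zero size) and returns the length it would have written, so a result that is negative or not smaller than the buffer size means the path was truncated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: writes "dir/leaf" into dst.
 * Returns false if an error occurred or the result did not fit. */
static bool join_path(char *dst, size_t dst_size,
                      const char *dir, const char *leaf)
{
    assert(dst != NULL && dst_size > 0);
    int len = snprintf(dst, dst_size, "%s/%s", dir, leaf);
    return len >= 0 && (size_t)len < dst_size;
}
```

Unlike strncpy, this never leaves dst unterminated when the source is too long; the caller only has to honour the boolean result.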
Emil Velikov
f8b1b9404e mesa: automake: add all headers to the tarball.
Fixes: d8d81fbc31 ("mesa: Add infrastructure for a worker thread to process GL commands.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-17 13:10:09 +00:00
Emil Velikov
d9a41ce8aa mapi: automake: add all python scripts to EXTRA_DIST
Otherwise it'll be missing in the tarball and make distcheck will fail.

Fixes: 05dd4a1104 ("glapi: Generate GL API marshalling code from the XML.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-17 13:10:09 +00:00
Jonathan Gray
9e8d6ba1d6 glapi: avoid using $< in non-suffix make rules
Using $< in non-suffix make rules is a GNU extension.  Explicitly use
the name of the python script to fix the build on OpenBSD.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-17 13:06:26 +00:00
Alex Smith
ce4058dafd radv/ac: Fix shared memory offset calculation
The index passed to get_shared_memory_ptr is an attribute slot index,
i.e. the index of a vec4 within LDS. Therefore this must be scaled by
sizeof(vec4) to give the LDS byte offset.

Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: <mesa-stable@lists.freedesktop.org>
2017-03-17 09:35:48 +01:00
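The scaling in ce4058dafd is a one-line calculation. The sketch below assumes a vec4 of four 32-bit components (16 bytes); `lds_byte_offset` and `VEC4_SIZE_BYTES` are hypothetical names, not the ones used in ac_nir_to_llvm.c.

```c
#include <assert.h>
#include <stdint.h>

#define VEC4_SIZE_BYTES 16u   /* 4 components * 4 bytes, assumed */

/* Converts an attribute slot index (a vec4 index within LDS)
 * to a byte offset. */
static uint32_t lds_byte_offset(uint32_t slot_index)
{
    assert(slot_index <= UINT32_MAX / VEC4_SIZE_BYTES);  /* no overflow */
    return slot_index * VEC4_SIZE_BYTES;
}
```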
James Legg
e88cac1df0 radv: Fix using more than 4 bound descriptor sets
Avoid a buffer overflow in ac_nir_to_llvm.c's create_function when
using more than 4 descriptor sets. radv claims support for 8.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-17 09:12:43 +01:00
Tapani Pälli
70d25cae8b util/build-id: check dlpi_name before strstr call
According to dl_iterate_phdr man page first object visited is the
main program where dlpi_name is an empty string. This fixes segfault
on Android when using build-id as identifier.

Fixes: d4fa083e11 ("util: Add utility build-id code.")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-17 07:34:26 +02:00
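The guard added in 70d25cae8b can be sketched as a small predicate. `object_name_matches` is a hypothetical helper, and treating both a NULL and an empty dlpi_name as "no match" is an assumption made for this sketch; the message only states that the first object visited has an empty name and that the unchecked strstr call crashed on Android.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for a dl_iterate_phdr callback: decides whether
 * the visited object's name refers to `filename`. */
static bool object_name_matches(const char *dlpi_name, const char *filename)
{
    assert(filename != NULL);

    /* The main program is visited first and has no usable name. */
    if (dlpi_name == NULL || dlpi_name[0] == '\0')
        return false;

    return strstr(dlpi_name, filename) != NULL;
}
```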
Tapani Pälli
4d4558411d android: fix segfault within swap_buffers
Function droid_swap_buffers may get called without dri2_surf->buffer set,
in these cases we don't have a back buffer set either. Patch fixes segfault
seen with 3DMark that uses android.opengl.GLSurfaceView for rendering its UI.

backtrace:
   #00 pc 00013f88  /system/lib/egl/libGLES_mesa.so (droid_swap_buffers+104)
   #01 pc 000117b2  /system/lib/egl/libGLES_mesa.so (dri2_swap_buffers+50)
   #02 pc 000058b2  /system/lib/egl/libGLES_mesa.so (eglSwapBuffers+386)
   #03 pc 00011329  /system/lib/libEGL.so (eglSwapBuffersWithDamageKHR+553)
   #04 pc 000118e7  /system/lib/libEGL.so (eglSwapBuffers+55)
   #05 pc 000754dc  /system/lib/libandroid_runtime.so

v2: do like other backends, call get_back_bo (Emil Velikov)

Fixes: 2acc69d ("EGL/Android: Add EGL_EXT_buffer_age extension")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-17 07:30:34 +02:00
Timothy Arceri
72ab7bb765 radv: make sure gs copy shader is retrieved from the cache with the variant
Apps can limit the size of the cache via VkAllocationCallbacks so we
can't be sure that both are always in the cache.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-17 16:17:10 +11:00
Timothy Arceri
2845a108a9 radv: fallback to an in-memory cache when no pipline cache is provided
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-17 16:17:10 +11:00
Timothy Arceri
315e8a9321 radv: always create a fallback pipeline cache
This will be used as an in-memory cache when a pipeline cache is
not provided by the app.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-17 16:17:10 +11:00
Timothy Arceri
4ffdab78b9 radv: move cache check inside insert and search functions
This will allow us to use fallback in-memory and on-disk caches
should the app not provide a pipeline cache.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-17 16:17:10 +11:00
Timothy Arceri
124ec417f9 st/mesa: call glthread_destroy() before _vbo_DestroyContext()
Otherwise we have a race condition between vbo calls in the
glthread and the _vbo_DestroyContext() call.

This fixes a bunch of piglit crashes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-17 09:47:02 +11:00
Jason Ekstrand
08df015b9d anv/GetQueryPoolResults: Actually implement the spec
The Vulkan spec is fairly clear about when we should and should not
write query pool results.  We're also supposed to return VK_NOT_READY if
VK_QUERY_RESULT_PARTIAL_BIT is not set and we come across any queries
which are not yet finished.  This fixes rendering corruptions on The
Talos Principle where geometry flickers in and out due to bogus query
results being returned by the driver.  These issues are most noticeable
on Sky Lake GT4 when running on "ultra" settings.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100182
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-16 15:08:18 -07:00
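The rule stated in 08df015b9d can be sketched without any Vulkan headers. Everything below is a stand-in for illustration: the flag and status enums, `struct query` and `get_query_results` are hypothetical, and the wait flag is left out. A result is written only if the query is available or the partial flag is set; otherwise the destination slot is left untouched and the call reports "not ready".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the Vulkan flag bits and result codes. */
enum { RESULT_PARTIAL = 1 << 0, RESULT_WITH_AVAILABILITY = 1 << 1 };
enum status { STATUS_SUCCESS, STATUS_NOT_READY };

struct query {
    bool available;
    uint64_t value;
};

/* dst holds one uint64_t per query, or two when availability is
 * requested (value followed by the availability word). */
static enum status get_query_results(const struct query *queries,
                                     size_t count, unsigned flags,
                                     uint64_t *dst)
{
    enum status status = STATUS_SUCCESS;
    size_t stride = (flags & RESULT_WITH_AVAILABILITY) ? 2 : 1;

    assert(queries != NULL && dst != NULL);
    for (size_t i = 0; i < count; i++) {
        bool write_result =
            queries[i].available || (flags & RESULT_PARTIAL) != 0;

        if (write_result)
            dst[i * stride] = queries[i].value;
        else
            status = STATUS_NOT_READY;   /* slot is left untouched */

        if (flags & RESULT_WITH_AVAILABILITY)
            dst[i * stride + 1] = queries[i].available ? 1 : 0;
    }
    return status;
}
```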
Jason Ekstrand
81840130c0 anv/query: Invalidate the correct range
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-16 15:08:17 -07:00
Jason Ekstrand
4bbb4b95b8 anv/query: Fix the location of timestamp availability
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0 13.0" <mesa-dev@lists.freedesktop.org>
2017-03-16 15:08:17 -07:00
Jason Ekstrand
9e60f59e62 genxml: Add XML version tags
There's not much point to having them or not having them but this
reduces some pointless diff from the version we can auto-generate

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-16 15:08:17 -07:00
Kenneth Graunke
f51a320b12 aubinator: Use fprintf for output.
This will make it easier to choose an output file.  For now, it remains
stdout.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-16 10:48:44 -07:00
Kenneth Graunke
65a9d5eabb aubinator: Reuse decode_structure code for handling commands
The code for decoding structures and commands was almost identical.
The only differences are: we print dword headers for commands, and
we skip the first one (with the command opcode and lengths).

So, generalize decode_structure to add a starting DWord, and a flag
for printing the DWord headers, and reuse it.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-16 10:48:41 -07:00
Kenneth Graunke
f0aa8fd4e4 aubinator: Delete redundant NULL check.
handle_struct_decode() is just a wrapper around decode_structure()
with a NULL check.  But the only caller already does that NULL check.

So, just use decode_structure() directly.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-16 10:48:37 -07:00
Kenneth Graunke
65138ce019 aubinator: Fix indentation.
Three space, not four.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-16 10:48:32 -07:00
Topi Pohjolainen
bd25d9670b i965/gen8+: Do full stall when switching pipeline
just as earlier gens do.

CC: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96743
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 18:44:15 +02:00
Jonathan Gray
46707bc27b i965: remove unneeded asm/unistd.h include
Fix the build on OpenBSD by removing an unneeded include for asm/unistd.h.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-16 13:56:40 +00:00
Emil Velikov
e6bef50f4c i965: automake: remove spurious white space
Unintentionally introduced by yours truly with the i965 compiler move.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-16 13:55:42 +00:00
Jonathan Gray
d2bb0c8590 i965: avoid using a GNU make pattern rule
% pattern rules are a GNU extension.  As there is only one file here
avoid patterns and globbing entirely to fix the build on non-GNU make.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
v2 [Emil Velikov: brw_oa.py dependency]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-16 13:55:23 +00:00
Emil Velikov
ccb89e72aa docs/releasing: document how to squash/announce queued patches
In the odd case where a patch needs to be fixed, squash the appropriate
fix and document how. Add a note in the pre-release notes, such that
devs can quickly spot it.

v2: Grammar/typo fixes (Eric). Use upstream commit [SHA] as reference.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-16 13:22:40 +00:00
Emil Velikov
0f988add50 docs/releasing: release.sh is located in xorg/util-modular
Correct the silly typo s/macros/modular/ and add a reference to the
repository.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-16 13:18:13 +00:00
Emil Velikov
79562033b5 docs/releasing: remove "git clean" step
release.sh from master does not require the tree to be clean.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-16 13:18:11 +00:00
Emil Velikov
c81c563fbb mapi: remove Xlib/xcb include in gl_marshal.py
The only use of the header is to provide the _X_INLINE macro. We already
require (and provide where needed) 'inline', plus it's used in the file
already.

So replace the macro and drop the include. This fixes the build on
platforms which lack the header - from X-less Linuxes to Androids.

Fixes: 05dd4a1104 ("glapi: Generate GL API marshalling code from the XML.")
Reported-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100223
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-16 13:12:26 +00:00
Eric Engestrom
8a82f551cd docs/specs: update Khronos registries URLs
The registries were migrated to git and are now hosted on GitHub.
The old svn is now read-only, and will not be updated anymore.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2017-03-16 11:50:40 +00:00
Iago Toral Quiroga
ca34a3125f anv: improve error reporting when creating pipelines
Specifically, report 'out of memory' errors that might have happened while
emitting the pipeline's batch.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
1d7468311d anv: handle errors in emit_binding_table() and emit_samplers()
These can fail to allocate device memory; however, the driver can recover
from this error by allocating a new binding table block and trying again.

v2:
  - Instead of tracking the errors in these functions and making callers
    reset the batch's status before attempting to allocate a new block
    for the binding table, simply make callers responsible for setting
    the error status if they fail to allocate memory during the second
    attempt (Jason).
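
The retry described above can be sketched roughly as follows. This is a simplified model, not the driver's code: `struct cmd_state`, `emit_table()`, `new_block()` and `emit_with_retry()` are hypothetical names, and the "device memory" is reduced to a slot counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified recording state (not the anv structures). */
struct cmd_state {
   unsigned free_slots;   /* slots left in the current binding table block */
   unsigned blocks_left;  /* how many more blocks can still be allocated */
   bool error;            /* sticky error flag, set by the caller on failure */
};

/* Tries to emit a table of 'needed' slots into the current block. */
static inline bool
emit_table(struct cmd_state *s, unsigned needed)
{
   if (needed > s->free_slots)
      return false;
   s->free_slots -= needed;
   return true;
}

/* Tries to switch to a fresh block of 'block_size' slots. */
static inline bool
new_block(struct cmd_state *s, unsigned block_size)
{
   assert(block_size > 0);
   if (s->blocks_left == 0)
      return false;
   s->blocks_left--;
   s->free_slots = block_size;
   return true;
}

/* Caller side: first attempt, then one retry in a new block. Only a
 * failure of the second attempt is recorded as an error. */
static inline bool
emit_with_retry(struct cmd_state *s, unsigned needed, unsigned block_size)
{
   if (emit_table(s, needed))
      return true;
   if (new_block(s, block_size) && emit_table(s, needed))
      return true;
   s->error = true;
   return false;
}
```

The first failure is treated as recoverable and leaves no trace; the error flag is only set when the retry also fails.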

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
dd8348c8be anv: handle errors while allocating new binding table blocks
Also, we had a couple of instances in flush_descriptor_sets() where
we were returning a VkResult directly upon error, but the return
value of this function is not a VkResult but a uint32_t dirty mask,
so simply return 0 in these cases which reduces the amount of
work the driver will do after the error has been raised.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
be52f9693a anv/blorp: make anv_cmd_buffer_alloc_blorp_binding_table() return a VkResult
Instead of asserting inside the function, and then use that information
to return early from its callers upon failure.

v2:
  - Make sure that clear_color_attachment() and
    clear_depth_stencil_attachment() get the VkResult as well so they
    avoid executing the batch if an error happened. (Topi)

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
a578b06d7b anv/device: assert that commands submitted to a queue are not bogus
Any errors that may have happened during the command buffer recording are
reported by vkEndCommandBuffer() and it is the application's responsibility
to not submit broken commands to a queue.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
a752c4ecda anv/cmd_buffer: skip vkCmdExecuteCommands() on broken command buffers
v2: Assert on secondary commands, applications should've called
    vkEndCommandBuffer() and received an error for them before (Jason)

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
801493051e anv/cmd_buffer: skip vkCmdDispatch() on broken command buffers
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
18ec3fa2a9 anv/cmd_buffer: skip vkCmdDraw*() on broken command buffers
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
fb9d563fb9 anv: handle memory allocation errors during queue submissions
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
c04dbd6b3e anv/cmd_buffer: handle out of memory during vkCmdPushConstants
Fixes:
dEQP-VK.api.out_of_host_memory.cmd_push_constants

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
94a4f0c255 anv/cmd_buffer: handle allocation errors during vkCmdBeginRenderPass()
Fixes:
dEQP-VK.api.out_of_host_memory.cmd_begin_render_pass

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
d823f381a5 anv/cmd_buffer: skip vkCmdEndRenderPass() for broken command buffers
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
6743456699 anv/cmd_buffer: skip vkCmdNextSubpass() for broken command buffers
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
8174e63869 anv/cmd_buffer: report tracked errors in vkEndCommandBuffer()
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
68d88f0237 anv: handle failures when growing reloc lists
Growing the reloc list happens through calling anv_reloc_list_add() or
anv_reloc_list_append(). Make sure that we call these through helpers
that check the result and set the batch error status if needed.
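
As an illustration of the helper approach described above, here is a minimal sketch. The names (`reloc_list_add()`, `batch_add_reloc()`, `result_t`) and the `max_capacity` field are hypothetical; `max_capacity` exists only to simulate an allocation failure and is not part of the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for VkResult. */
typedef enum { OK = 0, ERR_OUT_OF_HOST_MEMORY = -1 } result_t;

struct reloc_list {
   uint32_t *offsets;
   size_t count;
   size_t capacity;
   size_t max_capacity;  /* test hook: growth beyond this fails */
};

struct batch {
   struct reloc_list relocs;
   result_t status;      /* first error seen while recording */
};

/* Appends an offset, growing the list when it is full. */
static inline result_t
reloc_list_add(struct reloc_list *list, uint32_t offset)
{
   assert(list != NULL);
   if (list->count == list->capacity) {
      size_t new_cap = list->capacity ? list->capacity * 2 : 4;
      if (new_cap > list->max_capacity)
         return ERR_OUT_OF_HOST_MEMORY;  /* simulated failure */
      uint32_t *p = realloc(list->offsets, new_cap * sizeof(uint32_t));
      if (p == NULL)
         return ERR_OUT_OF_HOST_MEMORY;
      list->offsets = p;
      list->capacity = new_cap;
   }
   list->offsets[list->count++] = offset;
   return OK;
}

/* Central helper: every reloc goes through here, so a growth failure
 * is always recorded in the batch (the first error wins). */
static inline void
batch_add_reloc(struct batch *batch, uint32_t offset)
{
   result_t r = reloc_list_add(&batch->relocs, offset);
   if (r != OK && batch->status == OK)
      batch->status = r;
}
```

Routing all list growth through one helper means individual call sites cannot forget to check the result.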

v2:
  - Handling the crashes is not good enough, we need to keep track of
    the error, for that, keep track of the errors in the batch instead (Jason).
  - Make reloc list growth go through helpers so we can have a central
    place where we can do error tracking (Jason).

v3:
  - Callers that need the offset returned by anv_reloc_list_add() can
    compute it themselves since it is extracted from the inputs to the
    function, so change the function to return a VkResult, make
    anv_batch_emit_reloc() also return a VkResult and let their callers
    do the error management (Topi)

v4:
  - Let anv_batch_emit_reloc() return an uint64_t as it originally did,
    there is no real benefit in having it return a VkResult.
  - Do not add an is_aux parameter to add_surface_state_reloc(), instead
    do error checking for aux in add_image_view_relocs() separately.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
d4bdd871dc anv: avoid crashes when failing to allocate batches
Most of the time we use macros that handle this situation transparently,
but there are some cases where we need to handle this explicitly.

This patch makes sure we don't crash, notice that error handling takes
place in the function that actually failed the allocation,
anv_batch_emit_dwords(), which will set the status field of the batch
so it can be used at a later moment to report the error to the user.

v2:
  - Not crashing is not good enough, we need to keep track of the error
    (Topi, Jason). Iago: now that we track errors in the batch, this
    is being handled.
  - Added guards in a few more places that needed it (Iago)

v3:
  - Check result of anv_batch_emitn() for NULL before calling memset()
    in emit_vertex_input() (Topi)

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
31f5049ff1 anv: handle allocation failure in anv_batch_emit_dwords()
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
9e69409fcf anv: handle allocation failure in anv_batch_emit_batch()
v2:
 - Call the error handler (Topi)

Fixes:
dEQP-VK.api.out_of_host_memory.cmd_execute_commands

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
a8ce8e3542 anv: add anv_batch_set_error() and anv_batch_has_error() helpers
The anv_batch_set_error() helper will track the first error that happened
while recording a command buffer. The helper returns the currently tracked
error to help the job of internal functions that may generate errors that
need to be tracked and return a VkResult to the caller.

We will use the anv_batch_has_error() helper to guard parts of the driver
that are not safe to execute if an error has been generated while recording
a particular command buffer.
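
A minimal sketch of the two helpers as described above. `result_t`, `struct batch`, `batch_set_error()` and `batch_has_error()` are hypothetical stand-ins rather than the driver's definitions; only the behaviour stated in the text (track the first error, return the tracked error) is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for VkResult; only the values used here. */
typedef enum {
   RESULT_SUCCESS = 0,
   RESULT_OUT_OF_HOST_MEMORY = -1,
   RESULT_OUT_OF_DEVICE_MEMORY = -2,
} result_t;

struct batch {
   result_t status;  /* first error recorded, or RESULT_SUCCESS */
};

/* Records 'error' only if no error is tracked yet, then returns the
 * currently tracked error so callers can simply return the result. */
static inline result_t
batch_set_error(struct batch *batch, result_t error)
{
   assert(error != RESULT_SUCCESS);
   if (batch->status == RESULT_SUCCESS)
      batch->status = error;
   return batch->status;
}

/* Guard for code that is unsafe to run after a recording error. */
static inline bool
batch_has_error(const struct batch *batch)
{
   return batch->status != RESULT_SUCCESS;
}
```

Because the setter returns the tracked error, an internal function that fails can write `return batch_set_error(batch, err);` and report the first failure rather than the latest one.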

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
d0195bd067 anv/cmd_buffer: add a status field to anv_batch
The vkCmd*() functions do not report errors; instead, any errors should be
reported by the time we call vkEndCommandBuffer(). This means that we
need to make the driver robust against inconsistent and/or incomplete
command buffer states through the command recording process, particularly,
avoid crashes due to access to memory that we failed to allocate previously.

The strategy used to do this is to track the first error that occurred while
recording a command buffer in the batch associated with it. We use the
batch to track this information because the command buffer may not be
visible to all parts of the driver that can produce errors we need to be
aware of (such as allocation failures during batch emissions).

Later patches will use this error information to guard parts of the driver
that may not be safe to execute.

v2: Move the field from the command buffer to the batch so we can track
    errors from batch emissions (Jason)

v3: Registering errors in the command buffer's batch during
    anv_create_cmd_buffer() is unnecessary, since the command buffer
    is freed at the end of the function in that case (Topi)

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
6dd06f54eb anv/cmd_buffer: report errors in vkBeginCommandBuffer()
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
88b539c4a0 anv: do not try to ref/unref NULL shaders
This situation can happen if we failed to allocate memory for the shader.

v2:
 - We shouldn't see NULL shaders in anv_shader_bin_ref so we should not check
   for that (Jason). Make sure that callers don't attempt to call this
   function with a NULL shader and assert that this never happens (Iago).

v3:
 - All callers to anv_shader_bin_unref seem to check for NULL before calling,
   so just assert that it is not NULL (Topi)

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
bad3a2e911 anv/blorp: return early if we failed to create the shader binary
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
e2f707ce5b intel/blorp: make upload_shader() return a bool indicating success or failure
For now we always return true, follow-up patches will handle fail scenarios.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
808503b8f8 anv: remove unnecessary function prototype.
The function is defined right after the prototype declaration. Also, the
prototype for it is included in anv_genX.h which is included via anv_private.h.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Timothy Arceri
04a9ca2700 mapi: don't include X11/Xlib-xcb.h on non PTHREAD platforms
Should fix the last of the glthread build issues on windows.
2017-03-16 15:45:40 +11:00
Timothy Arceri
4a32d473fd mesa: fix glthread marshal build issues on platforms without PTHREAD
2017-03-16 15:33:08 +11:00
Timothy Arceri
643b0fd7e9 mesa: fix glthread build issues on platforms without PTHREAD
2017-03-16 14:48:09 +11:00
Marek Olšák
c83562ccaa gallium: implement the backend of threaded GL dispatch
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:19 +11:00
Gregory Hainaut
93bdad3253 mesa/glthread: restore the dispatch table when incompatible gl calls are detected
While a context only has a single glthread, the context itself can be
attached to several threads. Therefore the dispatch table must be
updated in all threads before the destruction of glthread. In other
words, glthread can only be destroyed safely when the context is deleted.

Fixes remaining crashes in the glx-multithread-makecurrent* tests.

V2: (Timothy Arceri) updated gl_API.dtd marshal_fail description.

Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:19 +11:00
Gregory Hainaut
70e715eea6 mesa/glthread: don't set a dispatch table if we aren't the owner
Fix crashes when glxMakeCurrent is called.

Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:19 +11:00
Eric Anholt
012bfebc07 mesa: Track the current vertex/element array buffers for glthread.
We want to support glthread on GLES contexts with reasonable apps, and on
desktop for apps that use VBOs but haven't completely moved to core GL.
To do so, we have to deal with the "the user may or may not pass user
pointers to draw calls" problem.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:19 +11:00
Eric Anholt
238d027ed6 mesa: Disable glthread when glBegin() is called.
glBegin() swaps dispatch tables, and we don't have any code in place for
handling that in glthread (which also messes with dispatch tables), and I
don't particularly care to at this point.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:19 +11:00
Eric Anholt
cd1c003b18 mesa: Add an attribute for conditions to turn off threading.
The threading for GL core is in place, but there are so few applications
actually using a core GL context that it would be nice to extend support
back.  However, some of the features of compat GL (particularly user
vertex arrays) would be so expensive to track state for that we want to be
able to disable threading when we discover that the app is using them.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:19 +11:00
Eric Anholt
43d4f7a227 mesa: Add support for asynchronous glDraw* on GL core.
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:19 +11:00
Eric Anholt
b18755a457 mesa: Add support for NULL arguments like in glBufferData() in marshalling.
This will let us support things like glBufferData() that should be
asynchronous.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:19 +11:00
Eric Anholt
47f819d3cb mesa: Statically allocate glthread command buffer in the batch struct.
This avoids an extra pointer dereference in the marshalling functions,
which, with the instruction count being in the low 30s, could actually
matter for main-thread performance.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:19 +11:00
Eric Anholt
1d6b71c5c6 glapi: Mark vertex attrib pointer functions as async.
These don't actually read data out of the pointers, they set the
pointers (or offsets in a VBO) to be used in a later draw call.

v2: Don't forget glVertexAttribIPointer, and don't bother with annotations
    on aliases.
v3: Mark CompressedTexSubImage1D as sync also.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:19 +11:00
Paul Berry
a4a5de6f18 mesa: Custom thread marshalling for Flush.
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:18 +11:00
Paul Berry
154a4f2679 mesa: Custom thread marshalling for ShaderSource.
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:18 +11:00
Eric Anholt
efd63e234a mesa: Connect the generated GL command marshalling code to the build.
v2: Rebase on the Begin/End changes, and just disable this feature on
    non-GL-core.
v3: (Timothy Arceri) enable for non-GL-core contexts. Remove
    unrelated safe_mul() hunk. while loop style fix.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:18 +11:00
Marek Olšák
db06e91de2 Revert "mesa: make _mesa_alloc_dispatch_table() static"
This reverts commit 4009d22b61.

glthread needs it.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:18 +11:00
Paul Berry
ef30ce97a6 mesa: Create pointers for multithread marshalling dispatch table.
This patch splits the context's CurrentDispatch pointer into two
pointers, CurrentClientDispatch, and CurrentServerDispatch, so that
when doing multithread marshalling, we can distinguish between the
dispatch table that's being used by the client (to serialize GL calls
into the marshal buffer) and the dispatch table that's being used by
the server (to execute the GL calls).
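
A rough sketch of the split, assuming a toy one-entry dispatch table. The field names `CurrentClientDispatch` and `CurrentServerDispatch` come from the text above; everything else (`struct dispatch_table`, `marshal_Clear()`, `exec_Clear()`, `worker_drain()`, the fixed-size queue) is hypothetical and single-threaded for brevity.

```c
#include <assert.h>
#include <stddef.h>

struct context;

/* Toy dispatch table with a single entry point. */
struct dispatch_table {
   void (*Clear)(struct context *ctx, unsigned mask);
};

struct context {
   const struct dispatch_table *CurrentClientDispatch; /* serializes calls */
   const struct dispatch_table *CurrentServerDispatch; /* executes calls */
   unsigned queue[8];      /* stand-in for the marshal buffer */
   size_t queued;
   unsigned cleared_mask;  /* observable effect of executed calls */
};

/* Client side: records the call instead of executing it. */
static void
marshal_Clear(struct context *ctx, unsigned mask)
{
   assert(ctx->queued < 8);
   ctx->queue[ctx->queued++] = mask;
}

/* Server side: performs the actual work. */
static void
exec_Clear(struct context *ctx, unsigned mask)
{
   ctx->cleared_mask |= mask;
}

static const struct dispatch_table marshal_table = { marshal_Clear };
static const struct dispatch_table exec_table = { exec_Clear };

static inline void
context_init_threaded(struct context *ctx)
{
   ctx->CurrentClientDispatch = &marshal_table;
   ctx->CurrentServerDispatch = &exec_table;
   ctx->queued = 0;
   ctx->cleared_mask = 0;
}

/* What a worker would do: replay queued calls through the server table. */
static inline void
worker_drain(struct context *ctx)
{
   for (size_t i = 0; i < ctx->queued; i++)
      ctx->CurrentServerDispatch->Clear(ctx, ctx->queue[i]);
   ctx->queued = 0;
}
```

With a single pointer, swapping in the marshalling table would leave the worker nothing to execute through; two pointers let both tables be reachable at once.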

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:18 +11:00
Eric Anholt
d8d81fbc31 mesa: Add infrastructure for a worker thread to process GL commands.
v2: Keep an allocated buffer around instead of checking for one at the
    start of every GL command.  Inline the now-small space allocation
    function.
v3: Remove duplicate !glthread->shutdown check, process remaining work
    before shutdown.
v4: Fix leaks on destroy.
V5: (Timothy Arceri) fix order of source files in makefile

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:18 +11:00
Eric Anholt
a76a3cf664 mesa: Validate count parameters when marshalling.
Otherwise, for example, glDeleteBuffers(-1, &bo) gets you a segfault
instead of GL_INVALID_VALUE.
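
To illustrate why the count must be checked before it is used as a size, here is a hedged sketch. `safe_mul()`, `marshal_ids()` and the `marshal_result` values are illustrative names, not the generated code; the point is only that a negative count is rejected before any memcpy() size is derived from it.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Overflow- and sign-checked multiply: returns -1 on invalid input. */
static inline int
safe_mul(int a, int b)
{
   if (a < 0 || b < 0)
      return -1;
   if (a == 0 || b == 0)
      return 0;
   if (a > INT_MAX / b)
      return -1;
   return a * b;
}

enum marshal_result {
   MARSHAL_QUEUED,         /* copied into the command buffer */
   MARSHAL_INVALID_VALUE,  /* would be reported as GL_INVALID_VALUE */
   MARSHAL_TOO_LARGE,      /* does not fit; needs another path */
};

/* Copies 'n' ids into 'cmd' only after validating the byte count. */
static inline enum marshal_result
marshal_ids(int n, const unsigned *ids, unsigned char *cmd,
            size_t cmd_capacity)
{
   assert(cmd != NULL);
   int bytes = safe_mul(n, (int)sizeof(unsigned));
   if (bytes < 0)
      return MARSHAL_INVALID_VALUE;
   if ((size_t)bytes > cmd_capacity)
      return MARSHAL_TOO_LARGE;
   if (bytes > 0)
      memcpy(cmd, ids, (size_t)bytes);
   return MARSHAL_QUEUED;
}
```

Without the sign check, a count of -1 converted to an unsigned size would request a copy of almost the entire address space.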

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:18 +11:00
Paul Berry
05dd4a1104 glapi: Generate GL API marshalling code from the XML.
This is not yet used in the build, just generated.

v2: Add missing build dependencies.
v3: Avoid mixing declarations and code, remove logic for avoiding emitting
    code that the compiler's optimizer can deal with anyway.
v4: (Timothy Arceri) move safe_mul() generation here from a later patch.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:18 +11:00
Eric Anholt
f05524ffaa glapi: Mark compressed teximage functions as sync.
Without doing some additional tracking, we won't know whether the data
will be immediate user data, or will be loaded from a PBO.  The normal
teximage functions will be sync by default because they don't know up
front what the size of their image data is.  But for compressed teximage,
we have the count information, so they would end up async by default.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:18 +11:00
Paul Berry
f5052f45a2 glapi: Annotate functions with "marshal" attribute.
Several API functions require special treatment in order to be marshalled
to a background thread.  Others can't be safely executed in a background
thread and need to be executed synchronously (e.g. since they return data
through a pointer argument).

This annotation will be used when code generating thread marshalling code,
to ensure that each function is marshalled in the correct way.

Note that PixelMap functions are marked as synchronous for now since
their pointer may be relative to buffer on the GPU, so we'll need
special logic to marshal them properly.

v2: Move description of attribute types to a comment in the dtd file.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:18 +11:00
Eric Anholt
3b7b6adf3a egl: Implement __DRI_BACKGROUND_CALLABLE
v2: (Timothy Arceri) use C99 initializers.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:18 +11:00
Paul Berry
6b70d9fce3 glx: Implement __DRI_BACKGROUND_CALLABLE
v2: Marek: Add DRI3 support.

v3: (Timothy Arceri) use C99 initializers.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:18 +11:00
Paul Berry
77630841da mesa: Add SetBackgroundContext to dd_function_table
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:18 +11:00
Paul Berry
5bc527d39d dri: Update dri_util to keep track of __DRI_BACKGROUND_CALLABLE
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:18 +11:00
Paul Berry
e043b2a1a0 dri_interface: Add new marshalling interfaces to dri_interface.h
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Marek Olšák <maraeo@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-16 14:14:18 +11:00
Roland Scheidegger
e1f9e9bafd gallivm: (trivial) remove duplicated line
pointed out by clang (stored value never read)
2017-03-16 04:03:29 +01:00
Roland Scheidegger
9d104dfd55 draw: (trivial) remove an unnecessary lp_build_alloca()
pointed out by clang (stored value never read)
2017-03-16 04:03:29 +01:00
Ilia Mirkin
e893b3a367 swr: support layer output in geometry shaders
This makes bin/gl-3.2-layered-rendering-gl-layer-render fail only with
2DMS_ARRAY, which is expected given the lackluster MSAA support. However
all the regular types pass.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-15 21:03:11 -04:00
Bas Nieuwenhuizen
ad4dee521d Revert "radv: Emit cache flushes before CP DMA."
This reverts commit cce43f6d8c.

Redundant, as the flush already happens at si_cp_dma_prepare.

Acked-by: Dave Airlie <airlied@redhat.com>
2017-03-16 00:55:03 +01:00
Francisco Jerez
e6469ec43b gallium/tgsi: Treat UCMP sources as floats to match the GLSL-to-TGSI pass expectations.
Currently the GLSL-to-TGSI translation pass assumes it can use
floating point source modifiers on the UCMP instruction.  See the bug
report linked below for an example where an unrelated change in the
GLSL built-in lowering code for atan2 (e9ffd12827)
caused the generation of floating-point ir_unop_neg instructions
followed by ir_triop_csel, which is translated into UCMP with a negate
modifier on back-ends with native integer support.

Allowing floating-point source modifiers on an integer instruction
seems like rather dubious design for a transport IR, since the same
semantics could be represented as a sequence of MOV+UCMP instructions
instead, but supposedly this matches the expectations of TGSI
back-ends other than tgsi_exec, and the expectations of the DX10 API.
I take no responsibility for future headaches caused by this
inconsistency.

Fixes a regression of piglit glsl-fs-tan-1 on softpipe introduced by
the above-mentioned glsl front-end commit.  Even though the commit
that triggered the regression doesn't seem to have made it to any
stable branches yet, this might be worth back-porting since I don't
see any reason why the bug couldn't have been reproduced before that
point.

Suggested-by: Roland Scheidegger <sroland@vmware.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99817
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-03-15 15:47:14 -07:00
Grazvydas Ignotas
eb5a61f77a util/disk_cache: do eviction before creating .tmp
cache_put() first creates a .tmp file and then tries to do eviction.
The recently added LRU eviction code selects non-empty directory with
the oldest access time, but that may easily be the one with just the
new .tmp file, especially on Linux where atime is updated lazily
(with "relatime" mount option, which is the default). So when cache is
small, if random doesn't hit another dir, LRU keeps selecting the same
dir with just the .tmp and not deleting anything. To fix this (and the
tests), do eviction earlier.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-16 09:36:18 +11:00
Tim Rowley
a7ce0490e4 swr: validate backend state numAttributes
General protection and prevents us from smashing the stack
on the first clear state validation (a7b8d50bcb).  Fixes crash
using icc.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-15 15:08:59 -05:00
Ben Widawsky
8378c576ab gbm: Export a get modifiers
This patch originally had i965 specific code and was named:
commit 61cd3c52b868cf8cb90b06e53a382a921eb42754
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Thu Oct 20 18:21:24 2016 -0700

    gbm: Get modifiers from DRI

To accomplish this, two new query tokens are added to the extension:
__DRI_IMAGE_ATTRIB_MODIFIER_UPPER
__DRI_IMAGE_ATTRIB_MODIFIER_LOWER

The query extension only supported 32b queries, and modifiers are 64b,
so we needed two of them.

NOTE: The extension version is still set to 13, so none of this will
actually be called.

v2: Error handling of queryImage (Emil)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-15 10:36:05 -07:00
Ben Widawsky
5c6e0d1c7d i965: introduce modifier selection.
Nothing special here other than a brief introduction to modifier
selection. Originally this was part of another patch but was split out
from
gbm: Introduce modifiers into surface/bo creation by request of Emil.

Requested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-15 10:36:05 -07:00
Ben Widawsky
191ff914a2 egl/drm: Use modifiers for backbuffer creation
Split into a separate patch from the previous patch as requested by
Emil.

Requested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-15 10:36:05 -07:00
Ben Widawsky
63bd2ae745 gbm: Introduce modifiers into surface/bo creation
The idea behind modifiers like this is that the user of GBM will have
some mechanism to query what properties the hardware supports for its BO
or surface. This information is directly passed in (and stored) so that
the DRI implementation can create an image with the appropriate
attributes.

A getter() will be added later so that the user of GBM will be able to
query what modifier should be used.

Only in surface creation, the modifiers are stored until the BO is
actually allocated. In regular buffer allocation, the correct modifier
can (and, in future patches, will) be chosen at creation time.

v2: Make sure to check if count is non-zero in addition to testing if
calloc fails. (Daniel)
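
The reason for the extra check is that calloc() with a zero count may legitimately return NULL, so a NULL result alone does not imply failure. A minimal sketch, using a hypothetical `copy_modifiers()` helper rather than the actual GBM code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stores a private copy of 'count' modifiers in '*out'.
 * Returns false only on a real allocation failure. */
static inline bool
copy_modifiers(const uint64_t *modifiers, unsigned count, uint64_t **out)
{
   assert(out != NULL);
   *out = NULL;
   if (count == 0)
      return true;  /* nothing to store; not an error */

   uint64_t *copy = calloc(count, sizeof(uint64_t));
   if (copy == NULL)
      return false; /* count is non-zero, so this is a failure */

   memcpy(copy, modifiers, count * sizeof(uint64_t));
   *out = copy;
   return true;
}
```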

v3: Remove "usage" and "flags" from modifier creation. Requested by
Kristian.

v4: Take advantage of the "INVALID" modifier added by the GET_PLANE2
series.

v5: Don't bother with storing modifiers for gbm_bo_create because that's
a synchronous operation and we can actually select the correct modifier
at create time (done in a later patch) (Jason)

v6: Make modifier condition outside the check so that dri_use will work
properly (Jason)

Cc: Kristian Høgsberg <krh@bitplanet.net>
References (v4): https://lists.freedesktop.org/archives/intel-gfx/2017-January/116636.html
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Daniel Stone <daniels@collabora.com>
2017-03-15 10:36:05 -07:00
Ben Widawsky
5e7d8d3961 i965: Implement basic modifier image creation
This is just a stub for now and will be filled in later.

This was split out of an earlier patch

Requested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-15 10:36:05 -07:00
Ben Widawsky
d075cce258 dri: Add an image creation with modifiers
Modifiers will be obtained or guessed by the client and passed in during
image creation/import. In guessing, a client might decide to simply pass
along all known modifiers.

This requires bumping the DRIimage version.

As of this patch, the modifiers aren't plumbed all the way down, this
patch simply makes sure the interface level stuff is correct.

v2: Don't allow usage + modifiers

v3: Make NAND actually NAND. Bug introduced in v2. (Jason)

v4:
- s/obtains/obtained (Jason)
- Pull out i965 implementation into a later patch (Emil)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Daniel Stone <daniels@collabora.com>
2017-03-15 10:36:04 -07:00
Marek Olšák
0550f3d631 radeonsi: implement TGSI opcodes TEX_LZ and TXF_LZ
This massively decreases VGPR spilling for DiRT Showdown, because we
no longer have to use v4i32 for 2D fetches when level == 0.
We now use v2i32 for those cases.

DiRT Showdown - Spilled VGPRs: -26 (-81%)

This surprisingly doesn't have any useful effect on performance (+ 0.05%).
2017-03-15 18:17:41 +01:00
Marek Olšák
a7cc9b0fcf glsl_to_tgsi: use TEX_LZ and TXF_LZ when available
2017-03-15 18:17:41 +01:00
Marek Olšák
46cbb00f53 glsl_to_tgsi: remove a redundant statement
it's the same as the last "else".
2017-03-15 18:17:41 +01:00
Marek Olšák
cca0389c72 gallium: add TGSI opcodes TEX_LZ and TXF_LZ
for better code generation in radeonsi
2017-03-15 18:17:41 +01:00
Marek Olšák
bf3cdf0fd3 gallium: add PIPE_CAP_TGSI_TEX_TXF_LZ
2017-03-15 18:17:41 +01:00
Samuel Pitoiset
7751ed39e4 radeonsi: disable sinking common instructions down to the end block
Initially this was a workaround for a bug introduced in LLVM 4.0
in the SimplifyCFG pass that caused image intrinsics to disappear
(because they were badly sunk). Finally, this is a win because it
decreases SGPR spilling and increases the number of waves a bit.

Although shader-db results are good, I think we might want to
remove it in the future once the issue is fixed. For now, enable
it for LLVM >= 4.0.

This also fixes a rendering issue with the speedometer in Dirt Rally.

More information can be found here https://reviews.llvm.org/D26348.

Thanks to Dave Airlie for the patch.

v2: - add a FIXME comment
    - use if (HAVE_LLVM >= 0x0400) instead

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99484
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97988
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-15 14:24:40 +01:00
Samuel Pitoiset
74265fd03c tgsi: add missing compute shader entry in tgsi_get_processor_name()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-15 14:16:29 +01:00
Samuel Pitoiset
38ee3246d2 radeonsi: clean up tex_fetch_ptrs()
Will also help when the src sampler register will be
TGSI_FILE_CONSTANT for bindless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-15 14:16:26 +01:00
Emil Velikov
8a5680f248 configure.ac: bump pthread-stubs requirement
On platforms that require it, we bump the requirement to 0.4 or later.
Due to an issue with the project [design], any earlier version is
bound to cause issues. For the specifics, see the pthread-stubs README.

Cc: Uli Schlachter <psychon@znc.in>
Cc: Jonathan Gray <jsg@jsg.id.au>
Cc: Jean-Sébastien Pédron <dumbbell@FreeBSD.org>
Cc: François Tigeot <ftigeot@wolfpond.org>
Cc: Tobias Nygren <tnn@NetBSD.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-03-15 11:49:27 +00:00
Emil Velikov
eec0cd71cd glx: don't expose systemTimeExtension for DRI2/DRI3/DRISW
Used/applicable to only dri1 drivers.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-03-15 11:48:50 +00:00
Emil Velikov
b1fb6e8d8c anv: do not open random render node(s)
drmGetDevices2() provides us with enough flexibility to build heuristics
upon. Opening a random node on the other hand will wake up the device,
regardless of whether it's the one we're interested in.

v2: Rebase, explicitly require/check for libdrm
v3: Return VK_ERROR_INCOMPATIBLE_DRIVER for no devices (Ilia)
v4: Rebase

Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-15 11:38:05 +00:00
Emil Velikov
743315f269 radv: do not open random render node(s)
drmGetDevices2() provides us with enough flexibility to build heuristics
upon. Opening a random node on the other hand will wake up the device,
regardless of whether it's the one we're interested in.

v2: Rebase.
v3: Return VK_ERROR_INCOMPATIBLE_DRIVER for no devices (Ilia)

Cc: Michel Dänzer <michel.daenzer@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-15 11:38:02 +00:00
Emil Velikov
8ff2937dfa radv/winsys: use drmGetDevice2 API
Analogous to previous commit

v2: Add explicit require_libdrm check.

Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-15 11:38:00 +00:00
Emil Velikov
858170e8a4 winsys/amdgpu: use drmGetDevice2 API
Analogous to previous commit

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98502
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-15 11:37:58 +00:00
Emil Velikov
a50c4eb2a0 loader: use drmGetDevice[s]2 API
This allows us to fetch the device list/info w/o the revision field.
At the moment retrieving the latter wakes up the device.

Note: kernel patch to resolve that should be in 4.10.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-15 11:37:55 +00:00
Emil Velikov
2c72e78ff5 autoconf/scons: bump libdrm to 2.4.75
We'll be using the drmGetDevice[s]2 API in src/loader with next patch.

v2: Rebase.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-15 11:37:39 +00:00
Emil Velikov
0fd61fb639 util/sha1: drop _mesa_sha1_{update, format} return type
Unused/unchecked by any of the callers.

v2: Fix the glsl cases that have crept in since v1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:18:45 +00:00
Emil Velikov
a9a4028fd7 util/sha1: rework _mesa_sha1_{init,final}
Rather than having an extra memory allocation [that we currently do not
check and act on accordingly], just make the API take a pointer to a
stack allocated instance.

This and follow-up steps will effectively make the _mesa_sha1_foo simple
define/inlines around their SHA1 counterparts.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:18:43 +00:00
Emil Velikov
c96127e873 util/sha1: add non-typedef name for the SHA1_CTX struct
Using typedef(s) is not always the answer and makes it harder for people
to do clever (or one might call nasty) things with the code.

Add a struct name which we will use with follow-up commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:15:53 +00:00
Bas Nieuwenhuizen
ef43eeb09f radv: Remove unused descriptor set field.
Trivial.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
2017-03-15 09:06:52 +01:00
Dave Airlie
686d060458 r600: refactor binding code for attach buffer to CB.
This refactors out the code and fixes it up to be used
for images later. It uses the code in the current RAT binding
for compute.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-15 14:33:26 +10:00
Dave Airlie
222e42e45f r600: refactor out CB setup.
This moves the code to create CB info out into
a separate function so it can be reused in images
code to create RATs.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-15 14:33:23 +10:00
Dave Airlie
0cf717821e r600: refactor texture resource words setup code.
This refactors out the code to setup a texture resource
so we can reuse it later from the images code.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-15 14:33:06 +10:00
Dave Airlie
95a976b651 r600: factor out the code to initialise a buffer resource.
This takes the code required to initialise a buffer resource
out of the texture buffer code, into its own function.

This is going to be used for the image support later.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-15 14:32:48 +10:00
Dave Airlie
cf2af021b9 r600g: make framebuffer atom rely on dual src blend state.
In order to make ARB_shader_image_load_store work, we have to share
the CB space with RATs, so we should only steal the dual src
space if we have dual src enabled.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-15 14:32:44 +10:00
Jason Ekstrand
d142c7436c intel/debug: Add a common INTEL_DEBUG=nohiz option
The GL driver had a driconf option (which doesn't make much sense) and
the Vulkan driver had a hand-rolled environment variable.  Instead,
let's tie both into the INTEL_DEBUG mechanism and unify things.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-14 21:00:09 -07:00
Jason Ekstrand
c09bb956ca anv/image: Move handling of INTEL_VK_HIZ
This makes it so that you don't get an "Implement gen7 HiZ" perf warning
when you manually disable HiZ on gen8.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-14 21:00:09 -07:00
Timothy Arceri
304b35b0e9 radv: trivial tidy ups
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-15 11:45:04 +11:00
Alan Swanson
b7e03d87e4 util/disk_cache: scale cache according to filesystem size
Select the higher of the current 1G default or 10% of the filesystem
where the cache is located.
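
A minimal sketch of that sizing rule, assuming the filesystem size has
already been obtained (for example from statvfs()). The helper and
macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* The 1 GiB default mentioned above. */
#define TOY_DEFAULT_MAX_SIZE (UINT64_C(1024) * 1024 * 1024)

/* Hypothetical helper: pick the larger of the 1 GiB default and 10% of
 * the filesystem that holds the cache. */
static uint64_t
toy_choose_cache_max_size(uint64_t fs_total_bytes)
{
   uint64_t tenth = fs_total_bytes / 10;
   uint64_t result = tenth > TOY_DEFAULT_MAX_SIZE ? tenth
                                                  : TOY_DEFAULT_MAX_SIZE;

   /* The default acts as a floor, never as a ceiling. */
   assert(result >= TOY_DEFAULT_MAX_SIZE);
   return result;
}
```

With this rule a 100 GiB filesystem yields a 10 GiB cache, while
anything below 10 GiB keeps the 1 GiB default.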

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:15:11 +11:00
Alan Swanson
f1e9671442 util/disk_cache: actually enforce cache size
Currently eviction is only one-in, one-out, so if the cache is at
max_size and cache files were to constantly increase in size, then so
would the cache. Restrict to a limit of 8 evictions per new cache entry.
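
The bounded eviction loop can be sketched on a toy in-memory model of
the cache. The struct, the function names and the fixed-size array are
assumptions for illustration. Only the limit of 8 comes from this
commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_EVICTIONS_PER_PUT 8

/* Toy cache model: just the sizes of the entries, in eviction order. */
struct toy_cache {
   uint64_t sizes[64];
   size_t   count;      /* number of valid entries in sizes[] */
   uint64_t total;      /* sum of the valid entries */
   uint64_t max_size;
};

/* Drop the entry at index 0 and account for the freed bytes. */
static void
toy_evict_one(struct toy_cache *c)
{
   assert(c->count > 0);
   c->total -= c->sizes[0];
   for (size_t i = 1; i < c->count; i++)
      c->sizes[i - 1] = c->sizes[i];
   c->count--;
}

/* Make room for new_size bytes, evicting at most
 * TOY_MAX_EVICTIONS_PER_PUT entries. Returns the number of evictions. */
static unsigned
toy_make_room(struct toy_cache *c, uint64_t new_size)
{
   unsigned evictions = 0;

   while (c->total + new_size > c->max_size &&
          c->count > 0 &&
          evictions < TOY_MAX_EVICTIONS_PER_PUT) {
      toy_evict_one(c);
      evictions++;
   }
   return evictions;
}
```

The cap keeps a single put from stalling on a long run of evictions, at
the cost of letting the cache briefly exceed max_size.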

V2: (Timothy Arceri) fix make check tests

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:15:11 +11:00
Alan Swanson
af09b86732 util/disk_cache: use LRU eviction rather than random eviction
Still using fast random selection of two-character subdirectory in
which to check cache files rather than scanning entire cache.
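
A rough sketch of the two steps: pick one of the 256 two-character
subdirectories at random, then choose the least recently used file
inside it. The record type and function names are hypothetical, and
real code would fill the access times from stat().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical per-file record for one cache file. */
struct toy_entry {
   const char *name;
   time_t      atime;   /* last access time */
};

/* Pick one of the 256 two-character hex subdirectories at random.
 * out must have room for 3 bytes ("xx" plus the terminator). */
static void
toy_random_subdir(char out[3])
{
   snprintf(out, 3, "%02x", (unsigned)rand() % 256u);
}

/* Return the index of the least recently used entry, or -1 if the
 * directory is empty. */
static int
toy_find_lru(const struct toy_entry *entries, size_t count)
{
   int lru = -1;

   assert(entries != NULL || count == 0);
   for (size_t i = 0; i < count; i++) {
      if (lru < 0 || entries[i].atime < entries[lru].atime)
         lru = (int)i;
   }
   return lru;
}
```

This is LRU only within the sampled subdirectory, so the result
approximates global LRU without scanning the whole cache.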

v2: Factor out double strlen call
v3: C99 declaration of variables where used

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-15 11:15:11 +11:00
Timothy Arceri
c2793e2c89 util/disk_cache: don't fallback to an empty cache dir on evict
If we fail to randomly select a two letter cache dir, don't select
an empty dir on fallback.

In real world use we should never hit the fallback path but it can
be hit by tests when the cache is set to a very small max value.

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:15:11 +11:00
Timothy Arceri
50989f87e6 util/disk_cache: use a thread queue to write to shader cache
This should help reduce any overhead added by the shader cache
when programs are not found in the cache.

To avoid creating any special function just for the sake of the
tests we add a one second delay whenever we call disk_cache_put()
to give it time to finish.

V2: poll for file when waiting for thread in test
V3: fix poll delay to really be 100ms, and simplify the wait function

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:15:11 +11:00
Timothy Arceri
fc5ec64ba3 util/disk_cache: add helpers for creating/destroying disk cache put jobs
V2: Make a copy of the data so we don't have to worry about it being
freed before we are done compressing/writing.
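
A minimal sketch of such a create/destroy pair, where the job owns a
private copy of the payload. The struct layout and names are
hypothetical and much simpler than a real put job would be.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical job record: owns a private copy of the payload so the
 * caller may free its buffer right after queuing the job. */
struct toy_put_job {
   size_t size;
   unsigned char data[];   /* flexible array member holding the copy */
};

static struct toy_put_job *
toy_create_put_job(const void *data, size_t size)
{
   assert(data != NULL || size == 0);

   /* One allocation for the header and the copied payload. */
   struct toy_put_job *job = malloc(sizeof(*job) + size);
   if (job == NULL)
      return NULL;

   job->size = size;
   if (size > 0)
      memcpy(job->data, data, size);
   return job;
}

static void
toy_destroy_put_job(struct toy_put_job *job)
{
   free(job);   /* frees the header and the copy together */
}
```

Keeping the copy inside the job's own allocation means destroy is a
single free() and the worker thread never touches caller memory.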

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:15:11 +11:00
Timothy Arceri
e2c4435b07 util/disk_cache: add thread queue to disk cache
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:15:10 +11:00
Dave Airlie
7372e3cf5f radv/ac: workaround regression in llvm 4.0 release
LLVM 4.0 was released with a pretty messy regression that will hopefully
get fixed in the future.

This workaround was proposed by Tom, and it fixes the CTS regressions
here at least. I'm not sure if this will cause any major side effects,
but correctness over speed and all that.

radeonsi should possibly consider the same workaround until an llvm
fix can be found.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-15 09:51:53 +10:00
Dave Airlie
3ece76f03d radv/ac: gather4 cube workaround integer
This fix is extracted from amdgpu-pro shader traces.

It appears the gather4 workaround for integer types doesn't
work for cubes, so instead it forces a float scaled sample,
then converts to integer.

It modifies the descriptor before calling the gather.

This also produces some ugly asm code for reasons specified
in the patch, llvm could probably do better than dumping
sgprs to vgprs.

This fixes:
dEQP-VK.glsl.texture_gather.basic.cube.rgba8*

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-15 09:51:53 +10:00
Bas Nieuwenhuizen
407fa77669 radv: Set driver version to mesa version;
I couldn't really find an encoding in the spec. I'm not sure it
prescribes VK_MAKE_VERSION format, but vulkan.gpuinfo.org interprets
it that way by default. vulkaninfo gives the raw number, so we could
alternatively do something like 17001000, but that doesn't show
up right on vulkan.gpuinfo.org again. Looking at that site, the -pro
driver also uses VK_MAKE_VERSION, so keeping consistency is probably
best.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-03-15 00:37:56 +01:00
Bas Nieuwenhuizen
ed28ae71f5 radv: Increase api version to 1.0.42.
I've skimmed the changes from 1.0.5 to 1.0.42 and I think we have all
changes. We're still not conformant of course, but this should not
regress stuff.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-03-15 00:37:56 +01:00
Jason Ekstrand
2e98db68e4 util/vk: Add helpers for finding an extension struct
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-15 08:22:02 +10:00
Alex Smith
e0cc32b85b radv: Flush before copying with PKT3_WRITE_DATA in CmdUpdateBuffer
Need to flush before updating the buffer to ensure that the copy is
ordered after previous accesses (assuming the app has performed the
appropriate barriers).

This fixes potential issues due to draws prior to an update reading
the new buffer content, despite having the necessary barriers between
them.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-14 22:17:03 +01:00
Bas Nieuwenhuizen
cce43f6d8c radv: Emit cache flushes before CP DMA.
The flushes could be due to TRANSFER barriers.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-14 22:16:34 +01:00
Jan Beich
fe56c745b8 Convert sed(1) syntax to be compatible with FreeBSD and OpenBSD
BSD regex library doesn't support extended RE escapes (e.g. \+) and
shorthand character classes (e.g. \s, \S) and SVR4-style word
delimiters[1] (on DragonFly and NetBSD). Both GNU and BSD sed support
-E and -r to enable extended RE but OS X still lacks -r.

[1] https://www.illumos.org/issues/516

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> (GNU sed)
2017-03-14 17:07:04 +00:00
Jason Ekstrand
aed2714145 anv: Properly enumerate physical devices when none are present 2017-03-14 09:08:07 -07:00
Jason Ekstrand
9d559ba39d nir/constant_expressions: Refactor helper functions
Apart from avoiding some unneeded size cases, this shouldn't have any
actual functional impact.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
762a6333f2 nir: Rework conversion opcodes
The NIR story on conversion opcodes is a mess.  We've had way too many
of them, naming is inconsistent, and which ones have explicit sizes was
sort-of random.  This commit re-organizes things and makes them all
consistent:

 - All non-bool conversion opcodes now have the explicit size in the
   destination and are named <src_type>2<dst_type><size>.

 - Integer <-> integer conversion opcodes now only come in i2i and u2u
   forms (i2u and u2i have been removed) since the only difference
   between the different integer conversions is whether or not they
   sign-extend when up-converting.

 - Boolean conversion opcodes all have the explicit size on the bool and
   are named <src_type>2<dst_type>.

Making things consistent also allows nir_type_conversion_op to be moved
to nir_opcodes.c and auto-generated using mako.  This will make adding
int8, int16, and float16 versions much easier when the time comes.
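
To illustrate why only i2i and u2u are needed, here is a sketch of what
the two forms compute for a 32-bit source. These are plain C stand-ins
with hypothetical names, not NIR code.

```c
#include <assert.h>
#include <stdint.h>

/* i2i64 on a 32-bit source: up-conversion sign-extends. The cast to
 * int32_t is implementation-defined for values above INT32_MAX and
 * wraps on the usual two's complement compilers. */
static uint64_t
toy_i2i64(uint32_t src)
{
   return (uint64_t)(int64_t)(int32_t)src;
}

/* u2u64 on a 32-bit source: up-conversion zero-extends. */
static uint64_t
toy_u2u64(uint32_t src)
{
   return (uint64_t)src;
}

/* Down-conversion only truncates, so i2i32 and u2u32 agree. */
static uint32_t
toy_x2x32(uint64_t src)
{
   return (uint32_t)src;
}

/* Usage: the two forms differ only when the top bit is set. */
static void
toy_conversion_demo(void)
{
   assert(toy_i2i64(0xffffffffu) == UINT64_C(0xffffffffffffffff));
   assert(toy_u2u64(0xffffffffu) == UINT64_C(0x00000000ffffffff));
   assert(toy_i2i64(5) == toy_u2u64(5));
   assert(toy_x2x32(UINT64_C(0x1234567890)) == 0x34567890u);
}
```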

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
7107b32155 i965/fs: Re-arrange conversion operations
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
bab4610e9c i965/vec4: Get rid of the type parameter from to/from_double
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
702d1af8ba glsl/nir: Use nir_type_conversion_op
Using the helper is way better than hand-coding the universe.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
6eb051e36f nir: Rewrite nir_type_conversion_op
The original version was very convoluted and tried way too hard to not
just have the nested switch statement that it needs.  Let's just write
the obvious code and then we know it's correct.  This fixes a bunch of
missing cases particularly with int64.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
9084b1db30 nir: Add a get_nir_type_for_glsl_base_type helper
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
a136884139 nir/validate: Rework ALU bit-size rule validation
The original bit-size validation wasn't capable of properly dealing with
instructions with variable bit sizes.  An attempt was made to handle it
by looking at source and destinations but, because the validation was
done in validate_alu_(src|dest), it didn't really have the needed
information.  The new validation code is much more straightforward and
should be more correct.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
370d68babc nir/validate: Validate that bit sizes and components always match
We've always required bit sizes to match but the rules for number of
components have been a bit loose.  You've never been allowed to source
from something with fewer components than you consume, but more has
always been fine.  This changes the validator to require that they match
exactly.  The fact that they don't always match has been a source of
confusion in NIR for quite some time and it's time we got rid of it.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
e9a45a3d5d nir: Make image_size a variable-width intrinsic
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
b377be9213 i965/fs: Use num_components from the SSA def in image intrinsics
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
0bf0365393 nir/lower_tex: Use tex_instr_dest_size for txs destinations
Using coord_components of the source texture is correct for everything
except cube maps where it's off by one.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:20 -07:00
Jason Ekstrand
fffa4111df nir/spirv: Restrict the number of channels in texture coordinates
Some SPIR-V texturing instructions pack more than the texture coordinate
into the coordinate source.  We need to mask off the unused channels.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:20 -07:00
Jason Ekstrand
3c312be7b3 nir/copy_prop: Respect the source's number of components
In the near future we are going to require that the num_components in a
src dereference match the num_components of the SSA value being
dereferenced.  To do that, we need copy_prop to not remove our MOVs from
a larger SSA value into an instruction that uses fewer channels.

Because we suddenly have to know how many components each source has,
this makes the pass a bit more complicated.  Fortunately, copy
propagation is the only pass that cares about how many components
are read by any given source, so it's fairly contained.

Shader-db results on Sky Lake:

   total instructions in shared programs: 13318947 -> 13320265 (0.01%)
   instructions in affected programs: 260633 -> 261951 (0.51%)
   helped: 324
   HURT: 1027

Looking through the hurt programs, about a dozen are hurt by 3
instructions and the rest are all hurt by 2 instructions.  From a
spot-check of the shaders, the story is always the same:  They get a
vec4 from somewhere (frequently an input) and use the first two or three
components as a texture coordinate.  Because of the vector component
mismatch, we have a mov or, more likely, a vecN sitting between the
texture instruction and the input.  This means that the back-end inserts
a bunch of MOVs and split_virtual_grfs() goes to town.  Because the
texture coordinate is also used by some other calculation, register
coalesce can't combine them back together and we end up with an extra 2
MOV instructions in our shader.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:20 -07:00
Jason Ekstrand
60d1aac28a nir/intrinsics: Make load_barycentric_input take a 2-component coor
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-14 07:36:20 -07:00
Jason Ekstrand
678fd00f2f anv/blorp: Only set a clear color for resolves if fast-cleared
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-03-14 07:36:20 -07:00
Jason Ekstrand
273b720310 anv/blorp: Turn off AUX after doing a CCS_D resolve
For render passes with multiple subpasses on gen7, we only fast-clear at
the top but an input attachment use can cause us to do a resolve in the
middle of the render pass.  Once we've done so, we no longer have a
fast-cleared surface so we can just set aux_usage to NONE.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-03-14 07:36:20 -07:00
Tapani Pälli
773d510c66 android: add '/vulkan' to libmesa_anv_entrypoints path
otherwise generated entrypoint headers are not found during build

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-14 07:48:30 +02:00
Tapani Pälli
4734322574 android: add src/intel/compiler to libmesa_intel_compiler includes
fixes build error when brw_nir.h is not found in the generated file
brw_nir_trig_workarounds.c.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-14 07:48:22 +02:00
Gwan-gyeong Mun
8f22552a4f anv: Add missing error-checking to anv_CreateDevice (v3)
This patch adds missing error-checking and fixes resource leak in
allocation failure path on anv_CreateDevice()

v2: Fixes from Jason Ekstrand's review
  a) Add missing destructors for all of the state pools on allocation
     failure path
  b) Add missing destructor for batch bo pools on allocation failure path

v3: Fixes from Emil Velikov's review
  Add missing destructor for queue and scratch_pool on allocation failure
  path
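
The failure-path cleanup described above follows the usual goto-unwind
idiom. A sketch with a hypothetical three-resource device follows. The
fail_at parameter exists only to force a failure in the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical device with three resources that are set up in order. */
struct toy_device {
   void *state_pool;
   void *batch_pool;
   void *scratch_pool;
};

/* Stand-in for an allocation that fails at step fail_at (0 = never). */
static void *
toy_alloc(int step, int fail_at)
{
   return step == fail_at ? NULL : malloc(16);
}

static bool
toy_create_device(struct toy_device *dev, int fail_at)
{
   assert(dev != NULL);

   dev->state_pool = toy_alloc(1, fail_at);
   if (dev->state_pool == NULL)
      goto fail;

   dev->batch_pool = toy_alloc(2, fail_at);
   if (dev->batch_pool == NULL)
      goto fail_state_pool;

   dev->scratch_pool = toy_alloc(3, fail_at);
   if (dev->scratch_pool == NULL)
      goto fail_batch_pool;

   return true;

   /* Unwind in reverse order of construction. */
fail_batch_pool:
   free(dev->batch_pool);
   dev->batch_pool = NULL;
fail_state_pool:
   free(dev->state_pool);
   dev->state_pool = NULL;
fail:
   return false;
}

static void
toy_destroy_device(struct toy_device *dev)
{
   free(dev->scratch_pool);
   free(dev->batch_pool);
   free(dev->state_pool);
}
```

Each new resource adds one label, so a missing destructor shows up as
a gap in the label chain.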

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 21:29:43 -07:00
Dave Airlie
b8ee70384a radv: setup llvm target data layout
Ported from radeonsi, pointed out by Tom.

"This prevents LLVM from using sext instructions for local memory
offsets and allows the backend to fold immediate offsets into the
instruction. This also prevents some incorrect code generation for
ptrtoint and inttoptr instructions."

Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tom Stellard <tstellar@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-14 10:33:59 +10:00
Alex Smith
c19607d59d radv: Reinitialise loaderMagic when allocating a cached command buffer
This must be set to ICD_LOADER_MAGIC by vkAllocateCommandBuffers, which
was being done when allocating a new buffer but not when reusing an
existing one in the cache. This would hit an assertion and crash in
debug builds of the Vulkan loader.
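
A sketch of the fix on a toy command pool with a free list. The types
and function names are hypothetical. The magic value is assumed to
match ICD_LOADER_MAGIC from the loader's vk_icd.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed to match ICD_LOADER_MAGIC in the Vulkan loader's vk_icd.h. */
#define TOY_ICD_LOADER_MAGIC 0x01CDC0DEu

struct toy_cmd_buffer {
   uintptr_t loader_magic;            /* must be valid on every hand-out */
   struct toy_cmd_buffer *next_free;
};

struct toy_cmd_pool {
   struct toy_cmd_buffer *free_list;  /* cached, previously freed buffers */
};

/* Hand out a buffer, reusing a cached one when available. The magic is
 * written on both paths because a recycled buffer may no longer hold it. */
static struct toy_cmd_buffer *
toy_alloc_cmd_buffer(struct toy_cmd_pool *pool)
{
   assert(pool != NULL);

   struct toy_cmd_buffer *cmd = pool->free_list;
   if (cmd != NULL) {
      pool->free_list = cmd->next_free;
   } else {
      cmd = calloc(1, sizeof(*cmd));
      if (cmd == NULL)
         return NULL;
   }

   cmd->next_free = NULL;
   cmd->loader_magic = TOY_ICD_LOADER_MAGIC;
   return cmd;
}

/* Return a buffer to the pool's cache instead of freeing it. */
static void
toy_free_cmd_buffer(struct toy_cmd_pool *pool, struct toy_cmd_buffer *cmd)
{
   cmd->next_free = pool->free_list;
   pool->free_list = cmd;
}
```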

Fixes: 682248db45 ("radv: Cache command buffers in command pool.")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-13 23:42:36 +01:00
Marek Olšák
cdbe4990cd gallium/radeon: disable the shader cache if dumping shaders
otherwise, cached shaders aren't dumped.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-13 23:34:52 +01:00
Marek Olšák
71a2e4e945 radeonsi: mark all bound shader buffer ranges as initialized
This should prevent cases when a buffer was incorrectly mapped without
synchronization just because this wasn't done.

Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-03-13 23:34:52 +01:00
Marek Olšák
686cd76a4c st/mesa: disable the shader cache if dumping shaders
otherwise, cached shaders aren't dumped.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-13 23:34:52 +01:00
Chad Versace
c5a0829e1f anv: Use vk_outarray in vkGetPhysicalDeviceQueueFamilyProperties
No intended change in behavior. Just a refactor.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 15:08:15 -07:00
Chad Versace
876f0ecd2f anv: Use vk_outarray in vkEnumeratePhysicalDevices (v2)
No intended change in behavior. Just a refactor.

v2: Replace vk_outarray_is_incomplete() with vk_outarray_status(). For
    Jason.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 15:08:15 -07:00
Chad Versace
62160536a0 util/vulkan: Add vk_outarray (v2)
This is a wrapper for a Vulkan output array. A Vulkan output array is
one that follows the convention of the parameters to
vkGetPhysicalDeviceQueueFamilyProperties().
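
The convention being wrapped can be sketched as follows, written by
hand rather than with vk_outarray. The function name and the uint32_t
element type are assumptions, and the result enum stands in for
VK_SUCCESS and VK_INCOMPLETE. Note that
vkGetPhysicalDeviceQueueFamilyProperties() itself returns void; the
status return applies to entry points such as
vkEnumeratePhysicalDevices().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for VK_SUCCESS / VK_INCOMPLETE so the sketch builds
 * without the Vulkan headers. */
enum toy_result { TOY_SUCCESS = 0, TOY_INCOMPLETE = 5 };

static enum toy_result
toy_enumerate(const uint32_t *available, uint32_t available_count,
              uint32_t *count, uint32_t *out)
{
   assert(count != NULL);

   if (out == NULL) {
      /* Query mode: only report how many elements exist. */
      *count = available_count;
      return TOY_SUCCESS;
   }

   /* Fill mode: *count is the capacity on entry and the number of
    * elements written on exit. */
   uint32_t n = *count < available_count ? *count : available_count;
   for (uint32_t i = 0; i < n; i++)
      out[i] = available[i];
   *count = n;

   return n < available_count ? TOY_INCOMPLETE : TOY_SUCCESS;
}
```

Callers typically invoke such a function twice: once with a NULL
array to learn the count, then again with storage of that size.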

v2: Replace vk_outarray_is_incomplete() with vk_outarray_status(). For
    Jason.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 15:08:11 -07:00
Lionel Landwerlin
bf47e5ba53 intel: genxml: prevent missing ; with address fields dwords
Before this change, the generator could print this kind of thing:

   const uint32_t v0 =
      __gen_uint(values->ValidBit, 0, 0) |
      __gen_uint(values->FaultType, 1, 2) |
      __gen_uint(values->SRCIDofFault, 3, 10) |
      __gen_uint(values->GTTSEL, 11, 1) |
   dw[0] = __gen_combine_address(data, &dw[0], values->VirtualAddressofFault, v0);

This change fixes the trailing '|'.
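
The usual way to avoid a dangling separator is to emit it before every
term except the first and to terminate the statement once at the end.
The sketch below shows that rule in C with a hypothetical helper. It is
not the generator's own code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Join terms with " | " and finish with ";". The separator is written
 * before every term except the first, so the last term can never be
 * followed by a stray '|'. Returns 0 on success, -1 if buf is too small. */
static int
toy_join_terms(char *buf, size_t buf_size,
               const char *const *terms, size_t count)
{
   size_t used = 0;

   assert(buf != NULL && buf_size > 0);
   buf[0] = '\0';

   for (size_t i = 0; i < count; i++) {
      int n = snprintf(buf + used, buf_size - used, "%s%s",
                       i == 0 ? "" : " | ", terms[i]);
      if (n < 0 || (size_t)n >= buf_size - used)
         return -1;
      used += (size_t)n;
   }

   /* Terminate the statement once, after the last term. */
   if (buf_size - used < 2)
      return -1;
   buf[used] = ';';
   buf[used + 1] = '\0';
   return 0;
}
```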

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 17:23:49 +00:00
Julien Isorce
9df3f28a8b gallium/hud: check NULL return from u_upload_alloc
Fixes the following segmentation fault:

signal SIGSEGV: invalid address (fault address: 0x0)
 frame #0: 0x00007fffe718e117 radeonsi_dri.so hud_draw_background_quad hud_context.c:170
   167
   168 	   assert(hud->bg.num_vertices + 4 <= hud->bg.max_num_vertices);
   169
-> 170 	   vertices[num++] = (float) x1;
   171 	   vertices[num++] = (float) y1;
   172
   173 	   vertices[num++] = (float) x1;
(lldb) bt
  * frame #0: 0x00007fffe718e117 radeonsi_dri.so`hud_draw_background_quad
    frame #1: 0x00007fffe718f458 radeonsi_dri.so`hud_draw
    frame #2: 0x00007fffe712967f radeonsi_dri.so`dri_flush

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-03-13 17:20:21 +01:00
Julien Isorce
d08c0930af winsys/radeon: check null return from radeon_cs_create_fence in cs_flush
Follow-up of patch:
"radeon_cs_create_fence: check null return from radeon_winsys_bo_create"

radeon_drm_cs_flush
  radeon_cs_create_fence
    radeon_winsys_bo_create

Signed-off-by: Julien Isorce <jisorce@oblong.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-03-13 17:19:29 +01:00
Julien Isorce
d09edb0146 winsys/radeon: check null in radeon_cs_create_fence
Fixes the following segmentation fault:

radeon_drm_cs_add_buffer (bo=0x0) at radeon_drm_cs.c
  -> if (!bo->handle)
(gdb) bt
0  radeon_drm_cs_add_buffer (bo=0x0) at radeon_drm_cs.c
1  0x00007fffe73575de in radeon_cs_create_fence radeon_drm_cs.c
2  0x00007fffe7358c48 in radeon_drm_cs_flush radeon_drm_cs.c

Signed-off-by: Julien Isorce <jisorce@oblong.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-03-13 17:17:30 +01:00
Juan A. Suarez Romero
192de3f051 vulkan/wsi: include builddir for generated headers
wayland-drm-client-protocol.h is generated in builddir, so when
builddir != srcdir the header is not found, and compilation of
wsi_common_wayland.c will fail.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-13 16:04:20 +01:00
Jason Ekstrand
dd4db84640 anv: Use on-the-fly surface states for dynamic buffer descriptors
We have a performance problem with dynamic buffer descriptors.  Because
we are currently implementing them by pushing an offset into the shader
and adding that offset onto the already existing offset for the UBO/SSBO
operation, all UBO/SSBO operations on dynamic descriptors are indirect.
The back-end compiler implements indirect pull constant loads using what
basically amounts to a texelFetch instruction.  For pull constant loads
with constant offsets, however, we use an oword block read message which
goes through the constant cache and reads a whole cache line at a time.
Because of these two things, direct pull constant loads are much faster
than indirect pull constant loads.  Because all loads from dynamically
bound buffers are indirect, the user takes a substantial performance
penalty when using this "performance" feature.

There are two potential solutions I have seen for this problem.  The
alternate solution is to continue pushing offsets into the shader but
wire things up in the back-end compiler so that we use the oword block
read messages anyway.  The only reason we can do this is because we know a
priori that the dynamic offsets are uniform and 16-byte aligned.
Unfortunately, thanks to the 16-byte alignment requirement of the oword
messages, we can't do some general "if the indirect offset is uniform,
use an oword message" sort of thing.

This solution, however, is recommended for a few reasons:

 1. Surface states are relatively cheap.  We've been using on-the-fly
    surface state setup for some time in GL and it works well.  Also,
    dynamic offsets with on-the-fly surface state should still be
    cheaper than allocating new descriptor sets every time you want to
    change a buffer offset which is really the only requirement of the
    dynamic offsets feature.

 2. This requires substantially less compiler plumbing.  Not only can we
    delete the entire apply_dynamic_offsets pass but we can also avoid
    having to add architecture for passing dynamic offsets to the back-
    end compiler in such a way that it can continue using oword messages.

 3. We get robust buffer access range-checking for free.  Because the
    offset and range are baked into the surface state, we no longer need
    to pass ranges around and do bounds-checking in the shader.

 4. Once we finally get UBO pushing implemented, it will be much easier
    to handle pushing chunks of dynamic descriptors if the compiler
    remains blissfully unaware of dynamic descriptors.
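
As a rough illustration of reasons 1 and 3, the following is a minimal
sketch of how a dynamic offset could be folded into the offset and range
that get baked into a surface state.  The struct and function names are
hypothetical and are not the actual anv code; the clamping rule is an
assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a dynamic buffer binding; not the anv types. */
struct dyn_buffer_binding {
   uint64_t buffer_size;   /* size of the underlying buffer in bytes */
   uint64_t base_offset;   /* offset recorded when the descriptor was written */
   uint64_t range;         /* range recorded when the descriptor was written */
};

/* Offset and range that would be written into the surface state. */
struct surface_range {
   uint64_t offset;
   uint64_t range;
};

/* Fold the dynamic offset into the binding at bind time.  Because the
 * resulting range is clamped to the end of the buffer, the shader needs no
 * extra bounds-checking of its own in this sketch. */
static struct surface_range
resolve_dynamic_binding(const struct dyn_buffer_binding *b,
                        uint32_t dynamic_offset)
{
   struct surface_range s;
   s.offset = b->base_offset + dynamic_offset;
   assert(s.offset <= b->buffer_size);

   uint64_t remaining = b->buffer_size - s.offset;
   s.range = b->range < remaining ? b->range : remaining;
   return s;
}
```

With this shape, the compiler only ever sees an ordinary buffer surface, so
constant-offset loads stay direct.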

This commit improves performance of The Talos Principle on ULTRA
settings by around 50% and brings it nicely into line with OpenGL
performance.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-13 07:58:00 -07:00
Jason Ekstrand
6b644e571e anv: Stall before fast-clear operations
During initial CCS bring-up, I discovered that you have to do a full CS
stall prior to doing a CCS resolve as well as afterwards.  It appears
that the same is needed for fast-clears as well.  This fixes rendering
corruptions on The Talos Principle on Sky Lake GT4.  The issue hasn't
been demonstrated on any other hardware; however, given that this appears
to be a "too many things in the pipe" problem, having it be easier to
reproduce on a system with more EUs makes sense.  The issue with
resolves is demonstrable on a GT3 or GT2 so this is probably also a
problem on all GTs.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-03-13 07:57:03 -07:00
Jason Ekstrand
5e44ef4a76 anv: Accurately advertise dynamic descriptor limits
The number of dynamic descriptors is limited by both the number of
descriptors and the total number of dynamic things.  Because there isn't
a single "maximum dynamic things" limit, we need to divide by two so
that they can create the maximum of both UBOs and SSBOs.
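
The arithmetic can be sketched as follows.  This is only an illustration
under assumed inputs: the function name and both parameters are
hypothetical, and the real driver limits are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Per-type limit for dynamic UBOs (and, separately, dynamic SSBOs).
 * max_descriptors:   hypothetical cap on descriptors of that type
 * max_dynamic_total: hypothetical shared budget of "dynamic things"
 * The shared budget is halved so that an application can use the advertised
 * maximum of both types at the same time. */
static uint32_t
per_type_dynamic_limit(uint32_t max_descriptors, uint32_t max_dynamic_total)
{
   uint32_t half = max_dynamic_total / 2;
   return max_descriptors < half ? max_descriptors : half;
}
```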

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-13 07:57:03 -07:00
Jason Ekstrand
d36b463817 anv: Add a helper for working with VK_WHOLE_SIZE for buffers
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2017-03-13 07:57:03 -07:00
Rob Clark
f805593b12 freedreno/ir3: fragz cannot be half precision
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-03-13 10:33:07 -04:00
Rob Clark
b1df639db6 freedreno/ir3: optimize less in glsl
Rely on nir for optimization, to reduce compile times.  Very minimal impact
on shader-db:

  total instructions in shared programs:          104170 -> 104199 (0.03%)
  total dwords in shared programs:                209664 -> 209728 (0.03%)
  total full registers used in shared programs:   7156 -> 7161 (0.07%)
  total half registers used in shader programs:   109 -> 109 (0.00%)
  total const registers used in shared programs:  24222 -> 24224 (0.01%)

                   half       full      const      instr     dwords
      helped          12         107         103         112          98
        hurt          11         104         105         115         102

But shader db runtime dropped from ~29.3s user to ~20.4s user.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-03-13 10:33:07 -04:00
Lionel Landwerlin
3278cd7610 aubinator/genxml: use gzipped files to store embedded genxml
This reduces the size of the aubinator binary from ~1.4Mb to ~700Kb.
We can now drop the checks on xxd in configure.

v2: Fix incorrect makefile dependency (Lionel)

v3: use $(PYTHON2) (Emil)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-13 13:36:31 +00:00
Lionel Landwerlin
351c951e09 intel: genxml: add script to generate gzipped genxml
v2 (from Dylan):
   Add main function
   Add missing Copyright
   Use print_function

v3: Add actual license (Dylan)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-13 13:36:27 +00:00
Jose Fonseca
b822d9c2b7 util/u_thread.h: Include stdint.h for int64_t definition.
Fixes MinGW build.  Trivial.
2017-03-13 12:23:11 +00:00
Iago Toral Quiroga
e8eeb759b7 intel: fix compiler build
compiler/brw_vec4_gs_visitor.cpp:744:39: error:
‘GEN7_MAX_GS_OUTPUT_VERTEX_SIZE_BYTES’ was not declared in this scope
           output_vertex_size_bytes <= GEN7_MAX_GS_OUTPUT_VERTEX_SIZE_BYTES);

Fixes: d0d4a5f43b ("i965: split EU defines to brw_eu_defines.h")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-13 13:09:24 +01:00
Christian König
8dee325752 svga: handle P016 format as well
Fixes: 62cff79378 ("gallium: add P016 format")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100180
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-13 12:49:41 +01:00
Emil Velikov
b82bd31c54 configure.ac: require pthread-stubs only where available
The project is a thing only for BSD platforms. Or in other words - for
any other platforms building/installing pthread-stubs results only in a
pthread-stub.pc file.

And even where it provides a DSO, there's a fundamental design issue
with it - see the pthread-stubs mailing list for the specifics.

v2: Update comment above the switch statement (Jon Turney).

Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Acked-by: Gary Wong <gtw@gnu.org>
Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Randy Fishel <randy.fishel@oracle.com>
Cc: Niveditha Rau <niveditha.rau@oracle.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-13 11:30:07 +00:00
Emil Velikov
9aebdb5d08 configure.ac: do not require the i965 driver for ANV
As of the last few commits we have the two split, thus we no longer require
the i965 in order to have the ANV driver.

Even though ANV does not link against libdrm nor libdrm_intel, we still
require those as dependencies due to the headers they provide.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:35 +00:00
Jason Ekstrand
ee8044fd33 intel/vulkan: Get rid of recursive make
v2 [Emil Velikov]
 - Various fixes and initial stab at the Android build.
 - Keep the generation rules/EXTRA_DIST outside the conditional

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:35 +00:00
Jason Ekstrand
7f9bbcfb7b intel/tools: Use a makefile included from intel/Makefile.am
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:35 +00:00
Emil Velikov
aa09c9552c intel/compiler: whitespace cleanups
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:35 +00:00
Emil Velikov
bdc5036464 intel/compiler: link all tests against gtest, even test_eu_compact
At the moment all the tests but test_eu_compact are actual C++ gtests.
To simplify things, we can move the gtest.la to the common TEST_LIBS.
As we're here, we can change the test extension [to .cpp] to
avoid using the confusing dummy.cpp.

Add a nice comment in the makefile for posterity.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:35 +00:00
Emil Velikov
f282ace678 i965: remove i965_symbols_test reference from .gitignore
The test/binary was removed back in 2012. With that one gone, we can
drop the .gitignore file altogether.

Cc: Eric Anholt <eric@anholt.net>
Fixes: c885039442 ("i965: Drop the missing symbols link test.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:35 +00:00
Jason Ekstrand
700bebb958 i965: Move the back-end compiler to src/intel/compiler
Mostly a dummy git mv with a couple of noticeable parts:
 - With the earlier header cleanups, nothing in src/intel depends on
files from src/mesa/drivers/dri/i965/
 - Both Autoconf and Android builds are addressed. Thanks to Mauro and
Tapani for the fixups in the latter
 - brw_util.[ch] is not really compiler specific, so it's moved to i965.

v2:
 - move brw_eu_defines.h instead of brw_defines.h
 - remove no-longer applicable includes
 - add missing vulkan/ prefix in the Android build (thanks Tapani)

v3:
 - don't list brw_defines.h in src/intel/Makefile.sources (Jason)
 - rebase on top of the oa patches

[Emil Velikov: commit message, various small fixes throughout]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:34 +00:00
Emil Velikov
d0d4a5f43b i965: split EU defines to brw_eu_defines.h
Split out the EU defines from the 'generic' ones, as the former are more
compiler oriented.

With a later commit we'll move brw_eu_defines.h alongside the compiler
infra to src/intel/. Pulling all the defines in there seems overzealous.

Some defines are used by both i965 and the i965 compiler. Those are
moved to brw_eu_defines.h, and annotated accordingly. The i965 users
were updated to have the extra include to indicate that.

With future work we might provide a better split, but for now this seems
reasonable.

Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:34 +00:00
Emil Velikov
a72ac98160 util/bitscan: use correct signature for ffs/ffsll
Otherwise we'll get errors such as

error: conflicting types for ‘ffs’
error: conflicting types for ‘ffsll’

We might want to improve the heuristics and provide a definition only
when a native one is missing. We can address that at a later stage.
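
For reference, the signatures in question are `int ffs(int)` and
`int ffsll(long long)`.  The sketch below shows portable fallbacks with
those parameter and return types; the `sketch_` names are hypothetical so
that they cannot clash with a native definition, and this is not the
util/bitscan implementation.

```c
#include <assert.h>

/* Return the 1-based position of the least significant set bit,
 * or 0 if no bit is set.  Same parameter and return types as ffs(). */
static int
sketch_ffs(int i)
{
   unsigned int v = (unsigned int)i;
   int pos = 1;

   if (v == 0)
      return 0;
   while ((v & 1u) == 0) {
      v >>= 1;
      pos++;
   }
   return pos;
}

/* 64-bit variant.  Same parameter and return types as ffsll(). */
static int
sketch_ffsll(long long i)
{
   unsigned long long v = (unsigned long long)i;
   int pos = 1;

   if (v == 0)
      return 0;
   while ((v & 1ull) == 0) {
      v >>= 1;
      pos++;
   }
   return pos;
}
```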

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:34 +00:00
Emil Velikov
fb0832b86d i965: add missing brw_defines.h include in brw_program.c
File is using MI_LOAD_REGISTER_IMM, GEN7_CACHE_MODE_1 and others as
defined in the header.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:34 +00:00
Emil Velikov
2eefb903d5 i965: add missing brw_defines.h include in brw_program.c
File is using the PIPE_CONTROL_* macros as defined in the header.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:34 +00:00
Emil Velikov
1d80407a6a i965: add missing #include <assert.h> in brw_inst.h
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:34 +00:00
Emil Velikov
077078ce77 i965: move brw_defines.h ifndef guard to the top
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:34 +00:00
Emil Velikov
8c432645bb i965: remove unused macros from brw_defines.h
The following three groups are used by neither the DRI module nor the
compiler.
 BRW_POLYGON_*_FACING
 BRW_POLYGON_FACING_*
 BRW_STATELESS_BUFFER_*

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:34 +00:00
Emil Velikov
7784b3c846 i965: remove unused brw_program.h include
Neither of the changed files requires the brw_program.h include. Since
we're about to move them [to src/intel/compiler] with the next commit
there's no point in having the include.

Let alone the very confusing compiler include directive
[-I${top_srcdir}/src/mesa/drivers/dri/i965/] that one would have to use.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:34 +00:00
Emil Velikov
c54c379b96 i965: remove duplicate declaration of brw_mark_surface_used
Function was made static and moved to another header with an earlier
commit.

Fixes: 760c8a1d95 ("i965: Make mark_surface_used a static inline in brw_compiler.h")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:33 +00:00
Emil Velikov
b69a03e12a i965: remove dead brw_new_shader() declaration
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Fixes: 194537ebe4 ("mesa/glsl/i965: remove Driver.NewShader()")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:33 +00:00
Emil Velikov
a032002dc9 i965: remove unused brw_cs.h include
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:33 +00:00
Jason Ekstrand
e042f5fcbc anv: Stop including brw_context.h
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:33 +00:00
Jason Ekstrand
4ec5922afa intel/isl: Stop linking libi965_compiler.la into tests
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:33 +00:00
Jason Ekstrand
12f348bc98 vulkan/wsi: Generate wayland protocol headers separately from EGL
Previously, we were depending on EGL for generating the headers and
providing the protocol symbols. However, since neither Vulkan driver
actually wants to link against EGL, this is kind of pointless. It also
creates a weird build dependency.

v2 [Jason]
 - Add missing wsi/ prefix, MKDIR_GEN

v3 [Emil Velikov]
 - include BUILT_SOURCES/generation rules outside of conditional

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:33 +00:00
Emil Velikov
1d135e2561 radv/wsi: Don't include wayland headers
Unused and we'll rework the way wayland-drm-client-protocol.h is
generated with a later commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-03-13 11:16:30 +00:00
Jason Ekstrand
4ea9bbe1f6 anv/wsi: Don't include wayland headers
Unused and we'll rework the way wayland-drm-client-protocol.h is
generated with a later commit.

v2 [Emil]
 - Also remove wayland-client.h

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:30 +00:00
Emil Velikov
d1042ef1dc configure.ac: provide a fall-back define for WAYLAND_SCANNER
In some cases, we can end up calling WAYLAND_SCANNER even when
there's no binary. Follow the approach set by
AX_PROG_FLEX/BISON and set the variable to :

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:30 +00:00
Emil Velikov
c1b5ed853f wayland: move .gitignore where applicable
Strictly speaking things work as-is, but let's move the file alongside
the artefacts it references. Analogous to all other places in mesa.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-13 11:16:30 +00:00
Christian König
5369b5a91d st/va: add config support for 10bit decoding v2
Advertise 10bpp support if the driver supports decoding to a P016 surface.

v2: Advertise 10bpp for the decoder as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Mark Thompson <sw@jkqxz.net>
2017-03-13 08:51:44 +01:00
Christian König
e9d3e29bb3 st/va: add support for allocating 10bpp surfaces
We support P010 and P016 as targets for 10bpp video decoding.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mark Thompson <sw@jkqxz.net>
2017-03-13 08:51:41 +01:00
Christian König
e58a1e8f68 st/va: add support for P010 and P016 formats v3
No hardware I know of can actually support P010 natively. But we can easily
support P016 and as long as nobody decodes anything into the lower 6bits it
doesn't make any difference to P010.
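
A small sketch of why the two formats are interchangeable here: P010 keeps
its 10 significant bits in the top of each 16-bit word and leaves the lower
6 bits zero, so a word written that way reads the same under either format.
The helper names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Place a 10-bit sample in the most significant bits of a 16-bit word,
 * leaving the lower 6 bits zero. */
static uint16_t
pack_10bit_sample(uint16_t value10)
{
   assert(value10 < 1024);
   return (uint16_t)(value10 << 6);
}

/* Recover the 10-bit sample from a 16-bit word. */
static uint16_t
unpack_10bit_sample(uint16_t word)
{
   return (uint16_t)(word >> 6);
}

/* True when nothing was decoded into the lower 6 bits, i.e. the word is
 * valid P010 data as well as valid P016 data. */
static int
lower_bits_clear(uint16_t word)
{
   return (word & 0x3f) == 0;
}
```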

v2: allow P016 for post processing as well
v3: fix post processing once more

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mark Thompson <sw@jkqxz.net>
2017-03-13 08:51:38 +01:00
Christian König
f1d1deb015 st/va: clear the video surface on allocation
This makes debugging of decoding problems quite a bit easier.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mark Thompson <sw@jkqxz.net>
2017-03-13 08:51:35 +01:00
Christian König
1ce68af07b st/va: cleanup error handling in vlVaCreateSurfaces2
No need to have that twice.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mark Thompson <sw@jkqxz.net>
2017-03-13 08:51:32 +01:00
Christian König
88f3451083 radeon/uvd: enable 10bit HEVC decode v2
Just use whatever the state tracker allocated.

v2: fix msb mode

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mark Thompson <sw@jkqxz.net>
2017-03-13 08:51:29 +01:00
Christian König
3e1e441aa0 radeon/UVD: fix the decoding target pitch calculation
The firmware expects the value in pixels, not bytes. It didn't make a difference
so far because we only used 8bpp surfaces.
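
The conversion amounts to dividing the byte stride by the sample size.  The
sketch below uses a hypothetical helper name and is not the UVD code; it
only shows why the two units coincide for 8bpp and diverge for 16-bit
formats such as P016.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a row stride in bytes to a pitch in pixels.
 * bytes_per_pixel is 1 for 8bpp surfaces and 2 for 16-bit ones. */
static uint32_t
pitch_in_pixels(uint32_t stride_bytes, uint32_t bytes_per_pixel)
{
   assert(bytes_per_pixel != 0);
   assert(stride_bytes % bytes_per_pixel == 0);
   return stride_bytes / bytes_per_pixel;
}
```

For a 1920-pixel-wide row, an 8bpp surface has a stride of 1920 bytes and a
pitch of 1920 pixels, so passing bytes went unnoticed; a 16-bit surface has
a stride of 3840 bytes but still a pitch of 1920 pixels.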

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mark Thompson <sw@jkqxz.net>
2017-03-13 08:51:25 +01:00
Christian König
cee591a224 vl/video_buffer: add support for P016
Simply adds the description of the planes.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mark Thompson <sw@jkqxz.net>
2017-03-13 08:51:22 +01:00
Christian König
62cff79378 gallium: add P016 format
Same layout as NV12, but 16bit per channel instead of 8.
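
To make the comparison concrete, here is a sketch of the frame size for a
two-plane 4:2:0 layout, parameterised on bytes per channel.  The function
name is hypothetical, and the sketch ignores any alignment or padding a
real allocation would add.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes for a two-plane 4:2:0 frame: a full-resolution luma plane followed
 * by one interleaved CbCr plane at half resolution in both directions.
 * bytes_per_channel is 1 for NV12 and 2 for P016. */
static uint64_t
two_plane_420_size(uint32_t width, uint32_t height,
                   uint32_t bytes_per_channel)
{
   assert(width % 2 == 0 && height % 2 == 0);

   uint64_t luma = (uint64_t)width * height * bytes_per_channel;
   uint64_t chroma =
      (uint64_t)(width / 2) * (height / 2) * 2 * bytes_per_channel;
   return luma + chroma;
}
```

The plane structure is identical for both formats; only the per-channel
size changes, so a P016 frame is twice the size of the NV12 frame.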

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mark Thompson <sw@jkqxz.net>
2017-03-13 08:51:07 +01:00
Kenneth Graunke
920ab07566 i965: Delete unused last_ring local.
Dead since 071d80bde2, and causing
warnings.
2017-03-12 22:57:46 -07:00
Bas Nieuwenhuizen
7c282b3ca1 radv: Store shaders in VRAM.
Less IFETCH latency on misses. Shader code is write once read many,
so GTT doesn't make much sense anyway.

If it turns out to fragment the CPU visible VRAM too much, we can upload with SDMA.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-13 02:14:29 +01:00
Dave Airlie
e27fdbcb4c radv/ac: move to new image intrinsics.
This hooks up radv to the new image intrinsic builders.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-13 09:44:53 +10:00
Dave Airlie
3b49cee8fa radv: disabled scaled formats for transfers.
These really are only supported for vertex buffers.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-13 09:36:49 +10:00
Timothy Arceri
13d69a8519 util/u_queue: make u_queue accessible to cpp
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-13 09:50:26 +11:00
Timothy Arceri
df1d5fc442 glsl: don't use ralloc for blob creation
There is no need to use ralloc here.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-13 09:50:19 +11:00
Timothy Arceri
ca76a2ba1b gallium/util: replace pipe_thread_setname() with u_thread_setname()
They do the same thing; we just moved the function to be
accessible to all of Mesa.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-12 17:49:04 +11:00
Timothy Arceri
14e6b86952 gallium/util: replace pipe_thread_get_time_nano() with u_thread_get_time_nano()
They do the same thing; we just moved the function to be
accessible to all of Mesa.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-12 17:49:04 +11:00
Timothy Arceri
f8cc4c25b8 gallium/util: replace pipe_thread_create() with u_thread_create()
They do the same thing; we just moved the function to be
accessible to all of Mesa.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-12 17:49:04 +11:00
Timothy Arceri
b822d9dd67 gallium/util: move u_queue.{c,h} to src/util
This will allow us to use it outside of gallium for things like
compressing shader cache entries.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-12 17:49:03 +11:00
Timothy Arceri
04ec4db8b5 gallium/util: make use of new u_thread.h in u_queue.{c,h}
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-12 17:49:03 +11:00
Timothy Arceri
fbfe887253 util: add u_thread.h
This is a minimal copy of os_thread.h from gallium in order to move
u_queue.{c,h} to this directory.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-12 17:49:03 +11:00
Timothy Arceri
a3b820308b gallium/util: use standard malloc/calloc/free in u_queue.c
This will help us move the file to the shared src/util dir.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-12 17:49:03 +11:00
Timothy Arceri
94a6457724 gallium/util: move u_string.h to src/util/u_string.h
This will help us move u_queue.c here eventually and also provide
string function wrappers for anyone wishing to port disk_cache.c
to Windows.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-12 17:43:06 +11:00
Timothy Arceri
d55d1e9805 gallium/util: remove unused header from u_string.h
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-12 17:43:06 +11:00
Timothy Arceri
ff8aad66bd gallium/util: remove unused util_strbuf*
Looks like they have been unused since 2008 (b8a7eef242).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-12 17:43:06 +11:00
Timothy Arceri
b4b1dcb2c1 gallium/util: remove unused util_memmove()
This is not used anywhere and Visual Studio looks to have
supported memmove() for a long time if not always.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-12 17:43:06 +11:00
Timothy Arceri
b607aad8e1 glsl: don't recompile a shader on fallback unless needed
Because we optimistically skip compiling shaders if we have seen them
before, we may need to compile them later at link time if they haven't
yet been used in a specific combination to create a program.

Rather than always recompiling we take advantage of the
gl_compile_status enum introduced in the previous patch to only
compile when we have previously skipped compilation.
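
The idea can be sketched as a three-state status plus a fallback that only
compiles when compilation was skipped earlier.  All names below are
hypothetical stand-ins chosen for the example, not the actual Mesa enum
values or functions.

```c
#include <assert.h>

/* Hypothetical three-state status: "skipped" means the cache had seen the
 * shader before, so no real compile happened. */
enum sketch_compile_status {
   SKETCH_COMPILE_FAILURE = 0,
   SKETCH_COMPILE_SUCCESS,
   SKETCH_COMPILE_SKIPPED
};

struct sketch_shader {
   enum sketch_compile_status status;
   int compile_count;   /* how many real compiles were performed */
};

/* Stand-in for the real compile step; always succeeds in this sketch. */
static void
sketch_compile(struct sketch_shader *sh)
{
   sh->compile_count++;
   sh->status = SKETCH_COMPILE_SUCCESS;
}

/* Link-time fallback: compile only if the earlier compile was skipped,
 * instead of unconditionally recompiling every shader. */
static void
sketch_fallback_compile(struct sketch_shader *sh)
{
   if (sh->status == SKETCH_COMPILE_SKIPPED)
      sketch_compile(sh);
}
```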

This helps with regressions in app start-up times on cold cache
runs, compared with no cache.

Deus Ex: Mankind Divided start-up times:

cache disabled:               ~3m15s
cold cache master:            ~4m23s
cold cache with this patch:   ~3m33s

Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-03-12 17:26:08 +11:00
Timothy Arceri
bfa95997c4 mesa/glsl: introduce new gl_compile_status enum
This will allow us to tell if a shader really has been compiled or
if the shader cache has just seen it before.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-03-12 17:24:40 +11:00
Matt Turner
3d253d330a i965: Initialize compaction tables in unit test.
Fixes: fa4b792e83 "i965: Move brw_init_compaction_tables() to brw_create_compiler()."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100154
2017-03-10 23:16:39 -08:00
Matt Turner
fa4b792e83 i965: Move brw_init_compaction_tables() to brw_create_compiler().
... so that we can avoid threading complications or unnecessary
compaction table initializations (which just consists of setting some
pointers based on devinfo->gen).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-10 17:58:11 -08:00
Emil Velikov
32be87852b bin/get-fixes-pick-list.sh: do not mandate bash
Silly thinko on my end, as I was writing the script. There is nothing
bash specific in there.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:49 +00:00
Emil Velikov
0e94217999 bin/shortlog_mesa.sh: remove the final bashism
Remove the typeset built-in and toggle to /bin/sh

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:48 +00:00
Emil Velikov
3aa5f51c27 bin/bugzilla_mesa.sh: rework the looping method
We don't use DRYRUN (and no other scripts have one) so just drop it.

This allows us to rework the loop to the more commonly used "git .... |
while read foo; do ... done"

That in itself gets rid of the only remaining bashism and we can toggle
the shebang to /bin/sh.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:48 +00:00
Emil Velikov
1c3a1d74ec wayland-egl/wayland-egl-symbols-check: do not mandate bash
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:48 +00:00
Emil Velikov
f7e7708d75 gbm/gbm-symbols-check: do not mandate bash
Analogous to previous commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:48 +00:00
Emil Velikov
5a0e4f4837 egl/egl-symbols-check: do not mandate bash
There's nothing bash specific in the script.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:48 +00:00
Emil Velikov
a3782f2b7a glsl/tests: remove any bashisms
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:48 +00:00
Emil Velikov
05c1d6d564 dri: use correct shebang for gen-symbol-redefs.py
This is a python2 script and the generic "python" may point to python3.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:48 +00:00
Emil Velikov
fb187d2232 util: remove shebang from format_srgb.py
Analogous to earlier commit(s).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:48 +00:00
Emil Velikov
2e8c683f5e xmlpool: remove shebang from gen_xmlpool.py
Analogous to earlier commit(s).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:48 +00:00
Emil Velikov
6d9ad29451 genxml: remove shebang from gen_pack_header.py
Analogous to earlier commit(s).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:48 +00:00
Emil Velikov
e4c7911150 nir: remove shebang from python scripts
Analogous to earlier commit(s).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:47 +00:00
Emil Velikov
a497f44645 st/xa: suffix xa-indent{,.sh} and add a shebang line
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:47 +00:00
Emil Velikov
c79c54ae34 gallium/tools: use correct shebang for python scripts
These are python2 scripts and the generic "python" may point to
python3.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:47 +00:00
Emil Velikov
e7b01d9fc8 gallium/tools: do not hardcode bash location
It is not guaranteed to be in /bin

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:47 +00:00
Emil Velikov
6f341b9dfd gallium/tests: remove execute bit from TGSI shader - vert-uadd.sh
Just like the dozens of other shaders, the file is parsed by a
separate tool and not executed.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:47 +00:00
Emil Velikov
68c38b2431 mapi/gen: remove shebang from python scripts
All of those should be executed via $PYTHON2/python2 [or equivalent] hence
why they are missing the execute bit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:47 +00:00
Emil Velikov
9a502f5c47 mapi: do not mandate bash for es*api/ABI-check
Seemingly there is nothing bash specific in these. Neither Debian's
checkbashisms nor a run in zsh spots anything.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:47 +00:00
Emil Velikov
d73603fcdd bin/perf-annotate-jit: add .py suffix
To provide direct feedback about the file in question.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:47 +00:00
Emil Velikov
f03e7af7b9 i965: remove shebang from brw_nir_trig_workarounds.py
Analogous to earlier commit(s).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:47 +00:00
Emil Velikov
1a39f3187c i965: remove execute bit from brw_nir_trig_workarounds.py
Analogous to earlier commit(s).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:47 +00:00
Emil Velikov
be4ce4937e mesa: remove shebang from python scripts
Analogous to earlier commit(s).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:46 +00:00
Emil Velikov
d2af6f6ee0 mesa: remove execute bit from main/format_parser.py
Analogous to earlier commit(s).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:46 +00:00
Emil Velikov
a1d186cb70 amd: remove shebang from python scripts
Analogous to earlier commit(s).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:46 +00:00
Emil Velikov
f6180a5ab7 amd: remove execute bit from python scripts
Analogous to earlier commit(s).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:46 +00:00
Emil Velikov
168d801149 gallium: remove shebang from python scripts
Analogous to earlier commit(s).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:46 +00:00
Emil Velikov
2ea1ce2701 gallium: remove execute bit from the python script(s)
Analogous to earlier commit(s).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:46 +00:00
Emil Velikov
5d15fe446d svga: remove shebang from svgadump/svga_dump.py
Analogous to earlier commit(s).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:46 +00:00
Emil Velikov
bd9bb86bc3 svga: remove execute bit from svga_dump.py
The file is used to generate svgadump/svga_dump.c... in theory at least.
Atm. the file is checked in-tree but that is about to change in later
commits.

As we get to that we'll use $PYTHON2 or equivalent as used throughout
the tree.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:46 +00:00
Emil Velikov
cc9533c53f freedreno: remove shebang from ir3_nir_trig.py
Analogous to earlier commit(s).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:46 +00:00
Emil Velikov
55ffbbf571 freedreno: remove execute bit from ir3_nir_trig.py
The file is meant to be called with $(PYTHON2) and not executed
directly.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:46 +00:00
Emil Velikov
56e58e01e4 glsl: remove shebang from python scripts
All of the scripts are [must be] executed via $PYTHON2 [or equivalent]
hence why they are missing the execute bit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:46 +00:00
Emil Velikov
eca18d440d glsl/tests: remove execute bit from compare_ir python script
Nearly all the python scripts used in-tree are invoked via $PYTHON2 or
equivalent. As such, having the execute bit is not needed and is generally
ill-advised.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:45 +00:00
Emil Velikov
7473fcd40b glsl/tests: suffix .sh/.py files as applicable
This makes it easier/clearer as to:
 - if the file should have the execute bit set (.py should not)
 - do we need the shebang in the first place and if so what it should be

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:45 +00:00
Emil Velikov
32d153c428 mesa: drop the execute bit from gl.xml
This is a spec file which is parsed by scripts.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:45 +00:00
Emil Velikov
45a37c98e7 mapi/glapi: remove unused next_available_offset.sh
Afaict there have been no [documented] users since it was introduced.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:45 +00:00
Ben Widawsky
2ee34bd5dc gbm: Export a per plane getter for offset
Unlike stride, there was no previous offset getter, so it can be right
on the first try.

v2: Return EINVAL when plane is greater than total planes to make it
match the similar APIs.
Avoid leak after fromPlanar (Daniel)
Make sure when getting offsets we consider dumb images (Daniel)

v3: Use Jason's recommendation for handling the non-planar case.

v4: Return int64_t so we can get real errors

v5: Add an assertion for dumb BOs (Jason)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Daniel Stone <daniels@collabora.com>
2017-03-09 15:35:44 -08:00
Ben Widawsky
7f6209e46f gbm: Export a per plane getter for stride
v2: Preserve legacy behavior when plane is 0 (Jason Ekstrand)
EINVAL when input plane is greater than total planes (Jason Ekstrand)
Don't leak the image after fromPlanar (Daniel)
Move bo->image check below plane count preventing bad index succeeding (Daniel)

v3: Fix DRIimage leak (using Jason's recommended change)
Make plane 0 return planar stride. This might break legacy behavior (Jason)

v4: Move bogus hunk for get_handle_for_plane to the right patch (Jason)
Fix error handling path to be cleaner (Jason)

v5: Add assert for dumb BOs to make sure plane == 0 (Jason)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Daniel Stone <daniels@collabora.com>
2017-03-09 15:35:44 -08:00
Ben Widawsky
ed4cf2440d gbm: Create a gbm_device getter for stride
This will be used so we can query information per plane.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Daniel Stone <daniels@collabora.com>
2017-03-09 15:35:44 -08:00
Ben Widawsky
f9567ab435 gbm: Export a getter for per plane handles
v2: Make the error return be -1 instead of 0 because I think 0 is
actually valid.

v3: Set errno to EINVAL when the specified plane is above the total
planes. (Jason Ekstrand)
Return the bo's handle if there is no image ie. for dumb images like cursor (Daniel)

v4:
- Add assertions about plane == 0 (Jason)
- Add a comment about the new restriction on planar dumb bo, which is now an
earlier patch in the series.
- Correctly refactor from v2 in this patch; it ended up rebased into the
wrong patch.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Daniel Stone <daniels@collabora.com>
2017-03-09 15:35:44 -08:00
Ben Widawsky
42eacddfc0 gbm: Export a plane getter function
This will be used by clients that need to know the number of planes
allocated for them on behalf of the GL or other API. The best current
example of this is when an extra "plane" is allocated to store
compression data for the primary plane.

v2: Return 1 for cases where there is no image, ie. dumb bo (Daniel)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Daniel Stone <daniels@collabora.com>
2017-03-09 15:35:44 -08:00
Ben Widawsky
770b06588f gbm: Explicitly disallow a planar dumb BO
As more GBM functionality supporting planes is being evaluated, it becomes
clear that a dumb bo can never actually be planar. It's questionable
whether it was ever feasible to do this, and later functionality will
implicitly assume a dumb BO is non-planar.

v2: Include stdbool.h

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Daniel Stone <daniels@collabora.com>
2017-03-09 15:35:44 -08:00
Anuj Phogat
29e2ba0756 i965: Rename brw_format_for_mesa_format() to brw_isl_format_for_mesa_format()
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-09 09:47:30 -08:00
Robert Bragg
a678b79ef4 i965: Add more Haswell OA metrics sets
This extends the brw_oa_hsw.xml to expose these additional queries:

- Compute Metrics Basic Gen7.5
- Compute Metrics Extended Gen7.5
- Memory Reads Distribution Gen7.5
- Memory Writes Distribution Gen7.5
- Metric set Sampler Balance

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 13:45:51 +00:00
Robert Bragg
458468c136 i965: Expose OA counters via INTEL_performance_query
This adds support for exposing basic Observation Architecture
performance counters on Haswell.

This support is based on the i915 perf kernel interface which is used
to configure the OA unit, allowing Mesa to emit MI_REPORT_PERF_COUNT
commands around queries to collect counter snapshots.

To take into account the small chance that some of the 32bit counters
could wrap around for long queries (~50 milliseconds for a GT3 Haswell @
1.1GHz) the implementation also collects periodic metrics.
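
The wrap-around concern can be illustrated with a minimal sketch (not the driver's code): as long as snapshots are taken more often than a counter can wrap, the modulo-2^32 difference between successive snapshots is exact and can be summed into a wider accumulator. The function names and sample values below are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # the counters are assumed to be 32 bits wide


def delta32(start, end):
    """Difference between two 32-bit snapshots, tolerating a single wrap."""
    return (end - start) & MASK32


def accumulate(snapshots):
    """Sum successive deltas into a wide accumulator (64-bit in C terms)."""
    total = 0
    for prev, cur in zip(snapshots, snapshots[1:]):
        total += delta32(prev, cur)
    return total


# Hypothetical raw counter values; the counter wraps between the 2nd and 3rd.
samples = [0xFFFFFF00, 0xFFFFFFF0, 0x00000010, 0x00000100]
print(accumulate(samples))  # -> 512
```

If the gap between two snapshots were long enough for more than one wrap, the delta would silently lose a multiple of 2^32, which is why periodic snapshots matter for long queries.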

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 13:45:50 +00:00
Robert Bragg
a98ffe2477 exec_list: Add a foreach_list_typed_from macro
This allows iterating list nodes from a given start point instead of
necessarily the list head.
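
As a rough illustration of the idea (not the C macro itself), the sketch below walks a list starting from an arbitrary node. It uses a simplified singly linked list; the class and function names are hypothetical and do not mirror the exec_list API.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None


def make_list(values):
    """Build a singly linked list and return its nodes in order."""
    nodes = [Node(v) for v in values]
    for a, b in zip(nodes, nodes[1:]):
        a.next = b
    return nodes


def iterate_from(start):
    """Yield nodes beginning at `start` rather than at the list head."""
    node = start
    while node is not None:
        yield node
        node = node.next


nodes = make_list(["a", "b", "c", "d"])
print([n.value for n in iterate_from(nodes[2])])  # -> ['c', 'd']
```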

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 13:45:50 +00:00
Robert Bragg
e56550565e i965: Add script to gen code for OA counter queries
Avoiding lots of error prone boilerplate and easing our ability to add +
maintain support for multiple OA performance counter queries for each
generation:

This adds a python script to generate code for building up
performance_queries from the metric sets and counters described in
brw_oa_hsw.xml as well as functions to normalize each counter based on
the RPN expressions given.
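
To show what normalizing a counter from an RPN expression involves, here is a minimal sketch of a stack-based evaluator. It is an assumption-laden illustration: the script in this commit generates C code rather than interpreting expressions, and the operator spellings and the `$Busy` variable below are hypothetical.

```python
import operator

# Hypothetical operator names; the real XML may spell them differently.
OPS = {
    "ADD": operator.add,
    "SUB": operator.sub,
    "MUL": operator.mul,
    "DIV": operator.truediv,
}


def eval_rpn(expr, variables):
    """Evaluate a space-separated RPN expression.

    Tokens starting with '$' are looked up in `variables`; any other
    non-operator token is parsed as a number.
    """
    stack = []
    for tok in expr.split():
        if tok in OPS:
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(OPS[tok](lhs, rhs))
        elif tok.startswith("$"):
            stack.append(variables[tok])
        else:
            stack.append(float(tok))
    if len(stack) != 1:
        raise ValueError("malformed RPN expression: " + expr)
    return stack[0]


# Hypothetical inputs: a raw counter value and a system variable.
values = {"$Busy": 20, "$EuCoresTotalCount": 40}
print(eval_rpn("$Busy $EuCoresTotalCount DIV 100 MUL", values))  # -> 50.0
```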

Although the XML file currently only includes a single metric set, the
code generated assumes there could be many sets.

The metrics as described in XML get translated into C structures
which are registered in a brw->perfquery.oa_metrics_table hash table
keyed by the GUID of the metric set in XML.

v2: numerous python style improvements (Dylan)
v3: Makefile.am fixups (Emil)
v4: Pattern rule for codegen + orthogonal .c and .h rules (Robert)

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 13:45:44 +00:00
Robert Bragg
f46e58e018 i965: extend query/counter structs for OA queries
In preparation for generating code from brw_oa_hsw.xml for describing OA
performance counter queries this adds some OA specific members to
brw_perf_query that our generated code will initialize:

- The oa_metric_set_id is the ID we will pass to
  DRM_IOCTL_I915_PERF_OPEN, and is an ID got via sysfs under:
  /sys/class/drm/<card>/metrics/<guid>/id

- The oa_format is the OA report layout we will request from the kernel

- The accumulator offsets determine where the different groups of A, B
  and C counters are located within an intermediate 64bit 'accumulator'
  buffer.

Additionally brw_perf_query_counter now has 64bit or float _read()
callback members for OA counters.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 12:53:07 +00:00
Robert Bragg
eaab41c9db i965: brw_context.h additions for OA unit query codegen
In preparation for generating code from the XML performance counter meta
data, this makes some additions to brw_context.h for this code to be
able to reference.

It adds a brw->perfquery.oa_metrics_table hash table for indexing built
up query descriptions by the GUID that is expected to be advertised by
the kernel (via sysfs) to be able to use that query.

It adds an 'OA_COUNTERS' brw_query_kind to be assigned to queries built
up by generated code.

It adds a brw->perfquery.sys_vars structure to have a consistent place
to represent the different system variables like $EuCoresTotalCount and
$EuSlicesTotalCount that are referenced by OA counter normalization
equations.

  Although extending + referencing gen_device_info for these variables
  was considered, these are some of the (mostly minor) reasons for
  going with a dedicated structure:

  - Currently we only need this info for the performance_query backend
    and it might be a bit tedious to go back and initialize the state
    for pre-Haswell devinfo structures.
  - Considering the $SubsliceMask then the requirement for how multiple
    per-slice masks are packed only comes from how the variables are
    referenced by availability tests in XML, and might not be a good
    general representation for tracking subslice masks if another use
    case arises.
  - If we used gen_device_info then we'd likely want to avoid making
    assumptions about the C types during codegen and adding explicit
    casts, while that's not necessary with a dedicated struct with all
    members being uint64_t.
  - This structure and the code for initializing it is currently shared
    (just through copy & paste) with a few other projects dealing with
    OA counters, and that's been convenient so far.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 12:53:07 +00:00
Robert Bragg
b79268174b i965: XML description of Haswell OA metric set
In preparation for exposing Gen Observation Architecture performance
counters via INTEL_performance_query this adds an XML description for an
initial 'Render Metrics Basic Gen7.5' query and corresponding counters.

The intention is to auto generate code for building a query from these
counters as well as the code for normalizing the individual counters.

Note that the upstream for this XML data is currently GPU Top:

  https://github.com/rib/gputop

The files are maintained under gputop-data/ and they are themselves
derived from files in an internal 'MDAPI XML' schema. There are scripts
under gputop-scripts/ and make rules in gputop-data/Makefile.xml for
maintaining these files.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 12:53:07 +00:00
Pierre Moreau
655c395f65 nv50/ir: check for origin insn in findOriginForTestWithZero
Function arguments do not have an "origin" instruction, causing a
NULL-pointer dereference without this check.

Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-09 12:42:46 +01:00
Samuel Pitoiset
d54b498694 mesa/main: make use of lookup_samplerobj_locked()
There is no need to check sampler == 0 twice. This removes now
unused _mesa_lookup_samplerobj_locked().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-09 11:01:37 +01:00
Samuel Pitoiset
58b4ae0411 mesa/main: inline {begin,end}_samplerobj_lookups()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-09 11:01:31 +01:00
Grazvydas Ignotas
8cd83a6c81 glsl/blob: clear padding bytes
Since blob is intended for serializing data, it's not a good idea to
leave padding holes with uninitialized data, which may leak heap
contents and hurt compression if the blob is later compressed, like
done by shader cache. Clear it.
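
A minimal sketch of the idea, assuming an append-only buffer with aligned writes; the `Blob` class here is a toy stand-in and not the glsl blob API. In C the gap would otherwise hold whatever the allocator left there, whereas this sketch writes the zeros explicitly to mark where the clearing happens.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


class Blob:
    """Toy append-only byte buffer used only for illustration."""

    def __init__(self):
        self.data = bytearray()

    def write_aligned(self, payload, alignment):
        padded = align_up(len(self.data), alignment)
        # Fill the alignment gap with zeros so that no stale bytes are
        # serialized and the padding compresses well.
        self.data.extend(b"\x00" * (padded - len(self.data)))
        self.data.extend(payload)


blob = Blob()
blob.write_aligned(b"\x01", 1)              # one byte at offset 0
blob.write_aligned(b"\xaa\xbb\xcc\xdd", 4)  # three zero bytes, then payload
print(blob.data.hex())  # -> 01000000aabbccdd
```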

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-09 20:41:02 +11:00
Grazvydas Ignotas
61bbb25a08 util/disk_cache: fix size subtraction on 32bit
Negating size_t on 32bit produces a 32bit result. This was effectively
adding values close to UINT_MAX to the cache size (the files are usually
small) instead of intended subtraction.
Fixes 'make check' disk_cache failures on 32bit.
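
The arithmetic can be modelled with explicit masks. This is a sketch under stated assumptions: the running cache size is taken to be a 64-bit unsigned counter, and the byte counts are made up for illustration.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def negate_size_t_32(size):
    """Model `-size` where size is a 32-bit unsigned size_t."""
    return (-size) & MASK32


def add_u64(total, delta):
    """Model addition on a 64-bit unsigned counter."""
    return (total + delta) & MASK64


cache_size = 10_000  # hypothetical running total, in bytes
file_size = 300      # hypothetical size of the evicted file

# Buggy: the 32-bit negated value is zero-extended, so the total grows
# by almost 2**32 instead of shrinking.
buggy = add_u64(cache_size, negate_size_t_32(file_size))

# Intended: widen to 64 bits first, then negate.
fixed = add_u64(cache_size, (-file_size) & MASK64)

print(buggy, fixed)  # -> 4294976996 9700
```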

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-09 20:26:30 +11:00
Grazvydas Ignotas
926bcacfd3 util/disk_cache: fix compressed size calculation
It incorrectly doubles the size on each iteration.

Fixes: 85a9b1b5 "util/disk_cache: compress individual cache entries"

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-09 20:26:23 +11:00
Lionel Landwerlin
f81ede4699 glsl: builtin: always return clones of the builtins
Builtins are created once and allocated using their own private ralloc
context. When reparenting IR that includes builtins, we might steal
bits of builtins. This is problematic because these builtins might now
be freed when the shader that includes them last is disposed. This
might also lead to inconsistent ralloc trees/lists if shaders are
created on multiple threads.

Rather than including builtins directly into a shader's IR, we should
include clones of them in the ralloc context of the shader that
requires them. This fixes double free issues we've been seeing when
running shader-db on a big multicore (72 threads) server.

v2: Also rename _mesa_glsl_find_builtin_function_by_name() to better
    reflect how this function is used. (Ken)

v3: Rename ctx to mem_ctx (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 08:30:36 +00:00
Kenneth Graunke
071d80bde2 i965: Delete render ring prelude.
This was a hook I came up with when trying to do the initial performance
counter work years ago.  Nothing's used it for a long time, and the
upcoming performance counter support doesn't want it either.

So, goodbye render ring prelude.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-03-08 23:01:21 -08:00
Vinson Lee
d64ded7b50 swr: s/uint/enum pipe_render_cond_flag/
Fix build error.

swr_context.cpp: In function ‘void swr_blit(pipe_context*, const pipe_blit_info*)’:
swr_context.cpp:336:44: error: invalid conversion from ‘uint {aka unsigned int}’ to ‘pipe_render_cond_flag’ [-fpermissive]
                                       ctx->render_cond_mode);
                                       ~~~~~^~~~~~~~~~~~~~~~

Fixes: b0d3938430 ("gallium: s/uint/enum pipe_render_cond_flag/ for set_render_condition()")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100133
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-03-08 21:43:07 -08:00
Bas Nieuwenhuizen
7d6e1a341a radv: Don't flush the CB before doing a fast clear eliminate.
The only way we write CMASK/DCC compressed textures through shaders
is fast clears and CMASK/DCC inits, which have their own flushes.
Hence the CB cache is always up to date.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:28 +01:00
Bas Nieuwenhuizen
8700329785 radv: Don't emit cache flushes on subpass switch.
I think we should only flush right before an action (draw/dispatch etc.),
as otherwise it is too easy to issue redundant flushes.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:23 +01:00
Bas Nieuwenhuizen
9251f8b35e radv: Only flush for the needed stages, and before the flushes.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:19 +01:00
Bas Nieuwenhuizen
f92a118434 radv: Don't invalidate CB/DB for images that aren't modified outside CB/DB.
Without stores, the only writes are fast clears, transfers and metadata
initialization, each of which has the appropriate invalidations already.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:14 +01:00
Bas Nieuwenhuizen
0567ab0407 radv: Flush more caches after writes.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:10 +01:00
Bas Nieuwenhuizen
7a600bbc81 radv: Don't flush for fixed-function reading.
The data should always be in memory after a src flush.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:05 +01:00
Bas Nieuwenhuizen
dd094e4ff9 radv: Invalidate the correct caches for CB/DB dst barriers.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:01 +01:00
Bas Nieuwenhuizen
b075eb7d47 radv: Determine cache flushes per object.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:34:42 +01:00
Samuel Pitoiset
2568d9d0cd mesa/main: remove unused _mesa_new_texture_image()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-03-09 01:57:20 +01:00
Dave Airlie
e6902be900 radv/ac: fixup texture coord to have right number of channels.
Jason has patches to add validation to this area; this should fix
radv shaders.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-09 09:17:11 +10:00
Timothy Arceri
0e34966340 st/nine: pass NULL to ureg_get_tokens()
The number of tokens is never used and the pointer is NULL checked
so just pass NULL.

Reviewed-by: Axel Davy <axel.davy@ens.fr>
2017-03-09 09:29:07 +11:00
Matt Turner
a45cd8107d docs: ARB_shader_atomic_counter_ops is enabled on i965/gen7+.
This extension was enabled in commit 40dd45d0c6 ("i965: Enable
ARB_shader_atomic_counter_ops") but the commit failed to update the
release notes or features.txt. The release notes ship has sailed, since
the commit was in 13.0.
2017-03-08 13:58:52 -08:00
Eric Anholt
19f571ba6d vc4: Fix math with a condition flag set.
Math results land in r4, regardless of the condition.  To implement them,
we just need to ensure that the results are moved out of r4 (as often
happens anyway, the value is live across another math instruction), so
that we can attach the condition to the MOV.

Fixes dEQP-GLES2.functional.shaders.random.all_features.fragment.93 and a
couple others, that were assertion failing that their conditions hadn't
been handled during the QIR->QPU stage.
2017-03-08 13:44:17 -08:00
Eric Anholt
615f6653b0 vc4: Fix register pressure cost estimates when a src appears twice.
This ended up confusing the scheduler for things like fabs (implemented as
fmaxabs x, x) or squaring a number, and it would try to avoid scheduling
them because it appeared more expensive than other instructions.

Fixes failure to register allocate in
dEQP-GLES2.functional.uniform_api.random.3 with almost no shader-db
effects (+.35% max temps)
2017-03-08 13:44:17 -08:00
Eric Anholt
0fca01d027 vc4: Report to shader-db how many threads a fragment shader has.
Doing instruction count analysis when we emit the thread switches that
will save us from tons of stalls is kind of missing the point.
2017-03-08 13:44:17 -08:00
Eric Anholt
61359324c1 Revert "vc4: Lazily emit our FS/VS input loads."
This reverts commit 292c24ddac.  It broke a
lot of GLES2 deqp, and I see at least one problem that will require some
serious rework to fix.
2017-03-08 13:44:17 -08:00
Marek Olšák
ab12a126fd radeonsi: fix elimination of literal VS outputs
broken when switched to the new intrinsics.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-03-08 19:56:36 +01:00
Fabio Estevam
78c5772633 loader: Move non-error message to debug level
Currently when running mesa on imx6 the following loader warnings
are seen:

# kmscube -D /dev/dri/card1
MESA-LOADER: device is not located on the PCI bus
MESA-LOADER: device is not located on the PCI bus
MESA-LOADER: device is not located on the PCI bus
Using display 0x1920948 with EGL version 1.4

As this is not an error message, change it to debug level in
order to have a cleaner log output.

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-08 16:35:00 +00:00
Mauro Rossi
61c38d14b7 android: r600: fix libmesa_amd_common dependency
Adding the libmesa_amd_common dependency and exporting its headers
avoids the following build error:

external/mesa/src/gallium/drivers/r600/evergreen_compute.c:29:10: fatal error: 'ac_binary.h' file not found
         ^
1 error generated.

Fixes: 3bbbb63 "automake: r600: radeonsi: correctly manage libamd_common.la linking"
Fixes: 503fb13 "radeon/ac: switch to ac_shader_binary_config_start()"

v2 [Emil Velikov: drop unneeded LOCAL_EXPORT_C_INCLUDE_DIRS]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-08 16:27:23 +00:00
Emil Velikov
1fe4d638a1 gallium/targets: rework the empty targets removal
Earlier commit added extra tracking and we've attempted to remove the
vdpau/other folder if empty. V2 of said commit dropped the pipe
to /dev/null and the explicit "true" override.

Sadly both of those are needed since there's no guarantee that the
folder will be empty before we [mesa] make install.

Since we're bringing those two back, there's no need to track if we've
installed anything, and simply do "rm -d foo/ &>/dev/null || true"

Tested-by: Andy Furniss <adf.lists@gmail.com>
Reported-by: Andy Furniss <adf.lists@gmail.com>
Fixes: 1cd4fde053 ("gallium/targets: don't leave an empty target directory(ies)")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-08 16:23:07 +00:00
Brian Paul
2f3f5728f7 util/indices: minor clean-ups
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:21 -07:00
Brian Paul
a0927da006 radeonsi: s/uint/enum pipe_shader_type/
This can probably be done in more places in the driver.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
b0d3938430 gallium: s/uint/enum pipe_render_cond_flag/ for set_render_condition()
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
2b9ab605aa gallium: s/uint/enum pipe_shader_type/ for set_constant_buffer()
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
73bafb5ee3 gallium: s/unsigned/enum pipe_shader_type/ for get_compiler_options()
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
1564a768ae virgl: s/unsigned/enum pipe_shader_type/
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
6614b060fb swr: s/unsigned/enum pipe_shader_type/
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
f676c700cc softpipe: s/unsigned/enum pipe_shader_type/
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
0fc5110a6e llvmpipe: s/unsigned/enum pipe_shader_type/
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
4aec68176d freedreno: s/unsigned/enum pipe_shader_type/
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
7532ed106f etnaviv: s/unsigned/enum pipe_shader_type/
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
b4191b712b draw: s/unsigned/enum pipe_shader_type/
and some s/uint/enum pipe_shader_type/

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
ed66c9d7b8 cso: s/unsigned/enum pipe_shader_type/
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
637e5719b5 gallium: s/unsigned/enum pipe_shader_type/ for pipe_screen::get_shader_param()
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Tapani Pälli
db5f9c3177 anv: change BLOCK_POOL_MEMFD_SIZE to exactly 2GB
This is what the comment above the definition says, and the change fixes an
issue with 32bit builds where BLOCK_POOL_MEMFD_SIZE is used as the ftruncate parameter
and constant currently gets converted from 4294967296 to 0.
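
The truncation can be reproduced with a small model. This is only a sketch: which C type the constant passes through on a 32-bit build is an assumption here; the commit itself only states that the value ends up as 0.

```python
def to_uint32(value):
    """Model converting an integer constant to a 32-bit unsigned type."""
    return value & 0xFFFFFFFF


FOUR_GIB = 1 << 32  # 4294967296, the old value quoted above
TWO_GIB = 1 << 31   # 2147483648, i.e. exactly 2GB

print(to_uint32(FOUR_GIB))  # -> 0
print(to_uint32(TWO_GIB))   # -> 2147483648
```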

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-08 07:57:55 +02:00
Matt Turner
58b69eedd3 Revert "configure.ac: Use PKG_CHECK_VAR for wayland-scanner."
This reverts commit 8a26e94439.
2017-03-07 21:24:05 -08:00
Matt Turner
0b361f9d35 Revert "configure.ac: Use PKG_CHECK_VAR for libclc."
This reverts commit 706074cc96.
2017-03-07 21:24:05 -08:00
Chris Wilson
05520ba490 i965: Remove use of deprecated drm_intel_aub routines
With mesa/drm commit cd2f91e18db087edf93fed828e568ee53b887860
Author: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>
Date:   Fri Jul 31 10:47:50 2015 -0700

    intel: Drop aub dumping functionality

the drm_intel_aub routines are mere stubs and do nothing. Likewise
remove our invocations.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-07 16:40:03 -08:00
Jason Ekstrand
4483c5d57c spirv: Silence unused variable warnings in release mode
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-03-07 15:22:16 -08:00
Jason Ekstrand
0421813588 anv: Make the framebuffer-renderpass format assert non-fatal
This should let Dota 2 run on debug builds though it will spew errors
like mad.  Hopefully, Valve will get this fixed sooner rather than
later.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-07 15:22:16 -08:00
Jason Ekstrand
33301d949f anv: Drop the anv_validate block helper
Over the course of driver development, we've come up with a number of
different schemes for adding giant blocks of asserts inside the driver.
This one is only being used once in anv_pipeline.c and the way it's
being used actually generates compiler warnings in release builds.  This
commit drops the anv_validate macro and just puts the contents of the
one validation function inside of a "#ifdef DEBUG" guard.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-07 15:22:16 -08:00
Jason Ekstrand
a316d8f406 anv: Get rid of the stub() macros
Except for a few unimplemented things on gen7, we don't really have
stubs anymore so we should drop this.  This commit replaces the few gen7
stub() calls with explicitly labeled finishme's and makes the sparse
binding stuff silently no-op or return a FEATURE_NOT_PRESENT error.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-07 15:22:16 -08:00
Jason Ekstrand
1488d079cb anv: Remove a pointless finishme
We've been supporting multiple shaders per module for some time now.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-07 15:22:16 -08:00
Jason Ekstrand
1a43792783 anv: Convert the HiZ finishme's to perf_warn
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-07 15:22:16 -08:00
Jason Ekstrand
201fc83df7 anv: Add a performance warning helper
This acts identically to anv_finishme except that it only dumps out
these nice log messages if you run with INTEL_DEBUG=perf.
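
The gating can be sketched as follows. This is an illustration, not the anv helper: the function names are hypothetical, and INTEL_DEBUG is assumed to be a comma-separated list of flags.

```python
import os
import sys


def debug_flags(env=None):
    """Parse INTEL_DEBUG as a comma-separated set of flags (assumed format)."""
    env = os.environ if env is None else env
    raw = env.get("INTEL_DEBUG", "")
    return {flag.strip() for flag in raw.split(",") if flag.strip()}


def perf_warn(message, env=None, out=sys.stderr):
    """Print a performance warning only when the 'perf' flag is set.

    Returns True if the message was printed, False otherwise.
    """
    if "perf" not in debug_flags(env):
        return False
    print("PERF: " + message, file=out)
    return True


# Silent by default, noisy only when explicitly requested.
perf_warn("slow path taken", env={})
perf_warn("slow path taken", env={"INTEL_DEBUG": "perf"})
```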

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-07 15:22:16 -08:00
Timothy Arceri
20234cfe3a st/mesa: don't propagate uniforms when restoring from cache
We will have already loaded the uniforms when the parameter list
was restored from cache.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-08 09:45:48 +11:00
Damien Grassart
e25c92a72d radv: remove duplicate initialization of alphaToOne feature
Fixes a GCC warning when compiling with -Wextra:
radv_device.c:463:47: warning: initialized field overwritten [-Woverride-init]

Signed-off-by: Damien Grassart <damien@grassart.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-08 06:00:34 +10:00
Dave Airlie
d81bd2f754 radv: disable mip point pre clamping.
No idea what this does, but disabling it fixes a bunch
of failing CTS tests in the lod area, so let's go with that.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-08 05:50:46 +10:00
Fredrik Höglund
162beb2abb radv/ac: fix multiple descriptor sets with dynamic buffers
The dynamic_offset_offset in the descriptor set binding layout is
relative to the dynamic_offset_start for the set in the pipeline
layout.
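
The relationship described above can be sketched like this. The field names follow the commit message, but the classes, the helper function, and the example layout are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SetLayoutInfo:
    dynamic_offset_start: int  # first dynamic-buffer slot owned by this set


@dataclass
class BindingInfo:
    dynamic_offset_offset: int  # slot index relative to the owning set


def dynamic_buffer_slot(set_info, binding):
    """Absolute slot = per-set start + per-binding relative offset."""
    return set_info.dynamic_offset_start + binding.dynamic_offset_offset


# Hypothetical pipeline layout: set 0 owns slots 0..1, set 1 starts at slot 2.
set0 = SetLayoutInfo(dynamic_offset_start=0)
set1 = SetLayoutInfo(dynamic_offset_start=2)
first_binding = BindingInfo(dynamic_offset_offset=0)

print(dynamic_buffer_slot(set0, first_binding),
      dynamic_buffer_slot(set1, first_binding))  # -> 0 2
```

Using only the relative offset would make the first binding of every set alias slot 0, which is the kind of failure that shows up once more than one set has dynamic buffers.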

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-07 20:23:32 +01:00
Fredrik Höglund
71bb1a9c3c radv: fix the size of the dynamic_buffers array
A buffer descriptor is 16 bytes, not 16 dwords.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-07 20:23:26 +01:00
Fredrik Höglund
0941d1a574 radv: fix the dynamic buffer index in vkCmdBindDescriptorSets
This fixes the wrong dynamic buffer descriptors being updated when
firstSet > 0.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-07 20:23:04 +01:00
Matt Turner
69063d0561 configure.ac: Ensure libomxil-bellagio exists before invoking pkg-config.
I was already tired of seeing the message

    Package libomxil-bellagio was not found in the pkg-config search path.
    Perhaps you should add the directory containing `libomxil-bellagio.pc'
    to the PKG_CONFIG_PATH environment variable
    No package 'libomxil-bellagio' found

on every configure, but I just got a distro bug reported where the user
was confused by this message and thought it indicated a bug.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-07 07:27:45 -08:00
Matt Turner
86c023f973 configure.ac: Ensure libva is enabled before invoking pkg-config.
PKG_CHECK_VAR can only check --variable=$NAME, so it cannot handle
modversion.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-07 07:27:45 -08:00
Matt Turner
706074cc96 configure.ac: Use PKG_CHECK_VAR for libclc.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-07 07:27:45 -08:00
Matt Turner
8a26e94439 configure.ac: Use PKG_CHECK_VAR for wayland-scanner.
Available since pkg-config-0.28 and pkgconf-0.8.10.

The removal of the AC_PATH_PROG is intentional. Use pkg-config.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-07 07:27:45 -08:00
Matt Turner
f73903f09b configure.ac: Fix error message in radeon_llvm_check().
It printed the version of LLVM ($1):

   configure: error: 3.6.0 requires libelf when using llvm

instead of the driver name ($2):

   configure: error: r600 requires libelf when using llvm

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
2017-03-07 07:27:45 -08:00
Matt Turner
e457e6abec build: Replace NEED_RADEON_LLVM with HAVE_GALLIUM_LLVM.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-07 07:27:45 -08:00
Bas Nieuwenhuizen
6424795f52 radv: Use the subresource range in HTILE initialization.
v2: fix levelCount assert.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-07 09:58:33 +01:00
Bas Nieuwenhuizen
3b455c1cb7 radv: Use winsys HTILE info.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-07 09:58:27 +01:00
Bas Nieuwenhuizen
dbecbab5aa radv/amdgpu: Let addrlib calculate the HTILE parameters.
Still not sure we can support miptrees when sampling from
HTILE enabled textures.

Added the tcCompatible winsys stuff while I'm at it.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-07 09:58:21 +01:00
Dave Airlie
03f5405fc2 amd/common: document PREDICATION OP 3 as 64-bit bool.
This just documents some info for possible future use.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-07 15:20:01 +10:00
Dave Airlie
b26249781e radv: handle z offset for 3d image <-> buffer copies.
This fixes:
dEQP-VK.pipeline.render_to_image.3d.huge.depth.r8g8b8a8_unorm

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-07 04:02:00 +00:00
Dave Airlie
c5947e9787 radv: move fast clear before resolve into own loop.
Don't fast clear inside the meta loop as things get
confused, fixes a crash in:
dEQP-VK.api.copy_and_blit.resolve_image.whole_array_image.2_bit

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-07 04:01:53 +00:00
Bas Nieuwenhuizen
0ab2dd361f radv: Disable HTILE for textures with multiple layers/levels.
It has issues and the fix I'm working on is too complicated for stable,
so disable for now.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
2017-03-06 23:58:57 +01:00
Dave Airlie
6bae1e44a9 radv: Properly handle destroying NULL devices and instances
Ported from anv:
3d33a23e anv: Properly handle destroying NULL devices and instances

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-07 08:17:03 +10:00
Dave Airlie
5c45d2051a radv/ac: introduce i1true/i1false to context.
This uses these in a few places, and fixes one or two
cases which were using da as 32-bit instead of bool.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-07 08:17:03 +10:00
Dave Airlie
ca884aef86 radv/ac: handle Z export using new builder.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-07 08:17:03 +10:00
Dave Airlie
bf2be50774 radv/ac: move to using common ac_get_image_intr_name.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-07 08:17:03 +10:00
Dave Airlie
10ae83a9c2 radeonsi/ac: move get_image_intr_name to common
This code is used in radv, so move to common build code.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-07 08:17:03 +10:00
Timothy Arceri
7eb85b8204 gallium/util: remove unused header from u_queue.c
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 09:12:16 +11:00
Timothy Arceri
60a2c2507d gallium/util: remove unused pipe_thread_destroy()
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 09:12:16 +11:00
Timothy Arceri
d82d8be614 gallium/util: replace pipe_thread_wait() with thrd_join()
Replace done using:
find ./src -type f -exec sed -i -- \
's:pipe_thread_wait(\([^)]*\)):thrd_join(\1, NULL):g' {} \;

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 09:12:16 +11:00
Timothy Arceri
da40ac65c7 gallium/util: remove PIPE_THREAD_ROUTINE()
This was made unnecessary with fd33a6bcd7.

This was mostly done with:
find ./src -type f -exec sed -i -- \
's:PIPE_THREAD_ROUTINE(\([^,]*\), \([^)]*\)):int\n\1(void \*\2):g' {} \;

With some small manual tidy ups.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 09:12:16 +11:00
Timothy Arceri
e92293a601 gallium/util: replace pipe_condvar with cnd_t
pipe_condvar was made unnecessary with fd33a6bcd7.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 09:07:33 +11:00
Timothy Arceri
e5375ba028 gallium/util: replace pipe_thread with thrd_t
pipe_thread was made unnecessary with fd33a6bcd7.

V2: fix compile error in u_queue.c

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:53:27 +11:00
Timothy Arceri
628e84a58f gallium/util: replace pipe_mutex_unlock() with mtx_unlock()
pipe_mutex_unlock() was made unnecessary with fd33a6bcd7.

Replaced using:
find ./src -type f -exec sed -i -- \
's:pipe_mutex_unlock(\([^)]*\)):mtx_unlock(\&\1):g' {} \;

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:53:05 +11:00
Timothy Arceri
ba72554f3e gallium/util: replace pipe_mutex_lock() with mtx_lock()
pipe_mutex_lock() was made unnecessary with fd33a6bcd7.

Replaced using:
find ./src -type f -exec sed -i -- \
's:pipe_mutex_lock(\([^)]*\)):mtx_lock(\&\1):g' {} \;

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:52:38 +11:00
Timothy Arceri
be188289e1 gallium/util: replace pipe_mutex_destroy() with mtx_destroy()
pipe_mutex_destroy() was made unnecessary with fd33a6bcd7.

Replace was done with:
find ./src -type f -exec sed -i -- \
's:pipe_mutex_destroy(\([^)]*\)):mtx_destroy(\&\1):g' {} \;

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:52:16 +11:00
Timothy Arceri
75b47dda0c gallium/util: replace pipe_mutex_init() with mtx_init()
pipe_mutex_init() was made unnecessary with fd33a6bcd7.

Replace was done using:
find ./src -type f -exec sed -i -- \
's:pipe_mutex_init(\([^)]*\)):(void) mtx_init(\&\1, mtx_plain):g' {} \;

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:52:07 +11:00
Timothy Arceri
acdcaf9be4 gallium/util: remove pipe_static_mutex()
This was made unnecessary with fd33a6bcd7.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:48:16 +11:00
Timothy Arceri
2efddc63ee gallium/util: replace pipe_mutex with mtx_t
pipe_mutex was made unnecessary with fd33a6bcd7.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:48:11 +11:00
Timothy Arceri
464d4806c1 gallium/util: replace pipe_condvar_broadcast() with cnd_broadcast()
pipe_condvar_broadcast() was made unnecessary with fd33a6bcd7.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:23:26 +11:00
Timothy Arceri
5e56c2c79d gallium/util: replace pipe_condvar_signal() with cnd_signal()
pipe_condvar_signal() was made unnecessary with fd33a6bcd7.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:23:26 +11:00
Timothy Arceri
74c879ac75 gallium/util: replace pipe_condvar_wait() with cnd_wait()
pipe_condvar_wait() was made unnecessary with fd33a6bcd7.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:23:26 +11:00
Timothy Arceri
1e0314281a gallium/util: replace pipe_condvar_destroy() with cnd_destroy()
pipe_condvar_destroy() was made unnecessary with fd33a6bcd7.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:23:26 +11:00
Timothy Arceri
3f58242863 gallium/util: replace pipe_condvar_init() with cnd_init()
pipe_condvar_init() was made unnecessary with fd33a6bcd7.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-07 08:23:26 +11:00
Marek Olšák
63d7a12fad st/dri: reduce dri_fill_st_options() params
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-03-07 08:16:46 +11:00
Marek Olšák
696c5115b9 st/dri: use local pointer to st_context_iface
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-03-07 08:16:39 +11:00
Gregory Hainaut
2ab5eccf5d glapi: fix typo in count_scale
2*4=8

Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-07 08:11:40 +11:00
Kenneth Graunke
7782936cbc i965: Return NULL from initScreen2, not false.
This returns a pointer, not a boolean.  No actual effect, but cleaner.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-06 12:38:15 -08:00
Kenneth Graunke
b5b123ac8f i965: Make a devinfo local variable.
screen->devinfo.gen is annoying to type and linewrap.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-06 12:38:15 -08:00
Kenneth Graunke
951f56cd43 i965: Delete vestiges of resource streamer code.
We never actually used the resource streamer in any shipping build
of Mesa.  We have no plans to do so in the future.  We looked into
using it in Vulkan, and concluded that it was unusable.  We're not
the only ones to arrive at the conclusion that it's not worth using.

So, drop the last vestiges of resource streamer support and move on.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-06 12:38:15 -08:00
Kenneth Graunke
4dc785728a i965: Drop duplicate #defines now that we've bumped libdrm requirements.
We've updated our libdrm requirement, and it will already provide these.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-06 12:38:15 -08:00
Samuel Pitoiset
4317cd96d3 getteximage: fix _mesa_GetTextureSubImage()
Oops.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100088
Fixes: 5ae54c0cf7 ("getteximage: avoid to lookup textures with id 0")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-06 21:36:56 +01:00
Grazvydas Ignotas
ff494fe999 ralloc: don't leave out the alignment factor
Experimentation shows that without alignment factor gcc and clang choose
a factor of 16 even on IA-32, which doesn't match what malloc() uses (8).
The problem is it makes gcc assume the pointer is 16 byte aligned, so
with -O3 it starts using aligned SSE instructions that later fault,
so always specify a suitable alignment factor.

Cc: Jonas Pfeil <pfeiljonas@gmx.de>
Fixes: cd2b55e5 "ralloc: Make sure ralloc() allocations match malloc()'s alignment."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100049
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Tested by: Mike Lothian <mike@fireburn.co.uk>
Tested by: Jonas Pfeil <pfeiljonas@gmx.de>
2017-03-06 11:28:48 -08:00
Grazvydas Ignotas
b384c23b9e i965: don't require 64bit cmpxchg
There are still some distributions trying to support unfortunate people
with old or exotic CPUs that don't have 64bit atomic operations. The
only thing preventing compile of the Intel driver for them seems to be
initialization of a debug variable.

v2: use call_once() instead of unsafe code, as suggested by Matt Turner

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93089
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-06 11:07:20 -08:00
Alex Smith
290d7e892d radv: Emit pending flushes before executing a secondary command buffer
If we have any pending flushes on the primary command buffer, these
must be performed before executing the secondary buffer.

This fixes potential corruption when the contents of a subpass which
clears any of its render targets are given in a secondary buffer: the
flushes after a fast clear would not have been performed until the
vkCmdEndRenderPass call.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
2017-03-06 19:46:14 +01:00
Samuel Pitoiset
052c81faa1 mesa/main: remove useless check in _mesa_IsSampler()
_mesa_lookup_samplerobj() returns NULL if sampler is 0.

v2: use _mesa_lookup...(...) != NULL

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-06 18:01:38 +01:00
Samuel Pitoiset
5ae54c0cf7 getteximage: avoid to lookup textures with id 0
This fixes the following assertion when the key is 0.

main/hash.c:181: _mesa_HashLookup_unlocked: Assertion `key' failed.

Fixes: 633c959fae ("getteximage: Return correct error value when texure object is not found")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-06 18:01:38 +01:00
Marek Olšák
5ac6ab701f docs/relnotes/17.1.0: document the new LLVM requirement
2017-03-06 17:35:36 +01:00
Marek Olšák
c416d8a3bc gallium/radeon: don't monitor SDMA busyness on EG/Cayman/SI
It's always busy.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99955

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-06 14:13:04 +01:00
Marek Olšák
7e1faa79d3 radeonsi: drop support for LLVM 3.6 & 3.7
They are too old.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-06 14:13:04 +01:00
Marek Olšák
d5d74fe2b5 radeonsi: set the convergent attribute where needed
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-06 14:13:04 +01:00
Marek Olšák
ef883fc554 gallivm,ac: add LP_FUNC_ATTR_CONVERGENT
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-06 14:13:04 +01:00
Marek Olšák
9b08f044be radeonsi: fix LLVM 3.9 - don't use non-matching attributes on declarations
Call site attributes are used since LLVM 4.0.

This also reverts commit b19caecbd6
"radeon/ac: fix intrinsic version check", because this is the correct fix.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-06 14:13:04 +01:00
Mark Thompson
6398a09213 st/omx: Set end-of-frame flag on bitstream output buffers
Since all output buffers are whole frames, this should always be set.

Technically, setting this flag is optional (see OpenMAX IL section
3.1.2.7.1), but some clients assume that it will be used and
therefore buffer indefinitely thinking that all output buffers are
fragments of the first frame when it is not set.

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-03-06 14:05:43 +01:00
Mark Thompson
6d95358aac st/omx: Fix port format enumeration
From OpenMAX IL section 4.3.5:
"The value of nIndex is the range 0 to N-1, where N is the number of
formats supported by the port.  There is no need for the port to
report N, as the caller can determine N by enumerating all the
formats supported by the port.  Each port shall support at least one
format.  If there are no more formats, OMX_GetParameter returns
OMX_ErrorNoMore (i.e., nIndex is supplied where the value is N or
greater)."

Only one format is supported, so N = 1 and OMX_ErrorNoMore should be
returned if nIndex >= 1.  The previous code here would return the
same format for all values of nIndex, resulting in an infinite loop
when a client attempts to enumerate all formats.

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-03-06 14:05:17 +01:00
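Note on 6d95358aac above: the enumeration contract quoted from the OpenMAX IL specification can be sketched as below. This is an illustration under assumptions, not the st/omx code. The enum values and function names are hypothetical stand-ins for OMX_ErrorNone, OMX_ErrorNoMore and the OMX_GetParameter path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the OpenMAX IL error codes and format type. */
enum sketch_error { SKETCH_ERROR_NONE, SKETCH_ERROR_NO_MORE };
enum sketch_format { SKETCH_FORMAT_AVC };

/* The single format this imaginary port supports, so N = 1. */
static const enum sketch_format supported_formats[] = { SKETCH_FORMAT_AVC };
#define NUM_FORMATS (sizeof(supported_formats) / sizeof(supported_formats[0]))

/* Return the format at index, or NO_MORE once index reaches N. */
static enum sketch_error
sketch_get_port_format(unsigned index, enum sketch_format *out)
{
   assert(out != NULL);
   if (index >= NUM_FORMATS)
      return SKETCH_ERROR_NO_MORE;
   *out = supported_formats[index];
   return SKETCH_ERROR_NONE;
}

/* Client-side enumeration: it terminates only because of the bound check
 * above. Without it, the loop would never see NO_MORE. */
static unsigned
sketch_count_formats(void)
{
   enum sketch_format fmt;
   unsigned n = 0;
   while (sketch_get_port_format(n, &fmt) == SKETCH_ERROR_NONE)
      n++;
   return n;
}
```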
Mark Thompson
0798fddb50 st/va: Fix forward/backward referencing for deinterlacing
The VAAPI documentation is not very clear here, but the intent
appears to be that a forward reference is forward from a frame in the
past, not forward to a frame in the future (that is, forward as in
forward prediction, not as in a forward reference in source code).
This interpretation is derived from other implementations, in
particular the i965 driver and the gstreamer client.

In order to match those other implementations, this patch swaps the
meaning of forward and backward references as they currently appear
for motion-adaptive deinterlacing.

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-03-06 14:05:05 +01:00
Mark Thompson
c93a157078 st/va: Support fractional framerate in misc parameter
Signed-off-by: Mark Thompson <sw@jkqxz.net>
Acked-by: Christian König <christian.koenig@amd.com>
2017-03-06 14:04:59 +01:00
Andy Furniss
012b6d3fe7 st/va encode handle ntsc framerate rate control
Tested with ffmpeg and gst-vaapi. Without this bits per
frame is set way too low for fractional framerates.

v2: Mark Thompson: simplify calculation.
    Use float.

Signed-off-by: Andy Furniss <adf.lists@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
2017-03-06 14:04:24 +01:00
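Note on 012b6d3fe7 and c93a157078 above: a rough sketch of why a fractional framerate matters for rate control. It assumes the VAAPI convention of packing the numerator in the low 16 bits and the denominator in the high 16 bits of the framerate field, with a zero denominator meaning 1. The function names are hypothetical and this is not the st/va code.

```c
#include <assert.h>
#include <stdint.h>

/* Split a packed framerate into numerator (low 16 bits) and denominator
 * (high 16 bits); a zero denominator is treated as 1. */
static float
sketch_frames_per_second(uint32_t packed)
{
   uint32_t num = packed & 0xffff;
   uint32_t den = packed >> 16;
   if (den == 0)
      den = 1;
   return (float)num / (float)den;
}

/* Average bit budget per frame for a target bitrate in bits per second. */
static float
sketch_bits_per_frame(uint32_t bitrate, uint32_t packed_framerate)
{
   float fps = sketch_frames_per_second(packed_framerate);
   assert(fps > 0.0f);
   return (float)bitrate / fps;
}
```

For NTSC, 30000/1001 would be packed as 30000 | (1001 << 16). Reading that whole value as an integer frame rate gives an enormous number of frames per second, which would make the per-frame budget far too small.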
Bas Nieuwenhuizen
f3dc318464 radv: Use the new L2 writeback flag.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-06 09:16:05 +01:00
Bas Nieuwenhuizen
66e12d4073 radv: Add L2 writeback.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-06 09:15:51 +01:00
Timothy Arceri
6b657cecd5 util/disk_cache: fix make check
Fixes make check after 11f0efec2e which caused disk cache
to create an additional directory.
2017-03-06 16:39:55 +11:00
Dave Airlie
2e73ccb485 radv/ac: use bitfield extract new intrinsics.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-06 15:27:33 +10:00
Dave Airlie
9c7309b09b radv/ac: move to new kill build.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-06 15:27:33 +10:00
Dave Airlie
a2652719f3 radv/ac: move to using new export intrinsics.
This uses the new code in build to do exports.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-06 15:27:33 +10:00
Dave Airlie
2830ece0fc radv/ac: switch to new intrinsics for pkrtz and clamp.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-06 15:27:32 +10:00
Dave Airlie
cc59e24a6b radv: drop Z24 support.
This isn't exposed in -pro, the hw docs say it is deprecated,
so let's not bother with it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-05 23:32:36 +00:00
Grazvydas Ignotas
6aaadd8728 radv: use VK_NULL_HANDLE for handles
Avoids warnings on 32bit.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-06 00:10:42 +01:00
Grazvydas Ignotas
a5446e3187 radv: check for upload alloc failure
Mainly to avoid gcc's complaints about uninitialized ptr and offset use
later in that code.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-06 00:10:42 +01:00
Grazvydas Ignotas
666fe622e1 radv: don't use uninitialized value on failure
Mainly to avoid a warning.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-06 00:10:42 +01:00
Grazvydas Ignotas
5458b02305 radv: avoid casting warnings on 32bit
Use the same helpers as for other handle<->pointer conversions.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-06 00:10:42 +01:00
Bas Nieuwenhuizen
fb7e4e16e7 radv/amdgpu: Add some debug flags.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-06 00:10:23 +01:00
Bas Nieuwenhuizen
682248db45 radv: Cache command buffers in command pool.
So that we don't keep allocating BOs for the IBs and upload buffers.

We run some risk of memory increase with e.g. a bimodal size
distribution of command buffers, but I haven't noticed a significant
increase with dota2 and talos.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-06 00:07:51 +01:00
Timothy Arceri
e3a01a5d1b Revert "glsl: Switch to disable-by-default for the GLSL shader cache"
This reverts commit 0f60c6616e.

Piglit and all games tested so far seem to be working without
issue. This change will allow wide user testing and we can decide
before the next release if we need to turn it off again.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-06 09:38:07 +11:00
Timothy Arceri
ee8d2e2804 docs: update envvars.html to reflect having a cache per arch
2017-03-06 09:33:20 +11:00
Timothy Arceri
11f0efec2e util/disk_cache: support caches for multiple architectures
Previously we were deleting the entire cache if a user switched
between 32 and 64 bit applications.

V2: make the check more generic, it should now work with any
platform we are likely to support.

V3: Use suggestion from Emil to make even more generic/fix issue
with __ILP32__ not being declared on gcc for regular 32-bit builds.

Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-03-06 09:27:01 +11:00
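Note on 11f0efec2e above: one way to keep 32-bit and 64-bit caches apart is to derive a subdirectory name from the pointer width. The sketch below is only an illustration of that idea. The function names and the directory layout are assumptions, not the util/disk_cache implementation.

```c
#include <assert.h>
#include <stdio.h>

/* Derive a per-architecture suffix from the pointer width. */
static const char *
sketch_arch_suffix(void)
{
   if (sizeof(void *) == 4)
      return "32";
   if (sizeof(void *) == 8)
      return "64";
   return "unknown_arch";
}

/* Build "<root>/<suffix>" into buf; returns 0 on success, -1 if the
 * result did not fit. */
static int
sketch_cache_path(char *buf, size_t len, const char *root)
{
   assert(buf != NULL && root != NULL);
   int n = snprintf(buf, len, "%s/%s", root, sketch_arch_suffix());
   return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

Because the check uses sizeof rather than a compiler-specific macro, it does not depend on __ILP32__ being defined.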
Grazvydas Ignotas
175d4aa8f5 util/disk_cache: mark read-only arguments const
No functional changes.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-06 09:23:17 +11:00
Dave Airlie
b19caecbd6 radeon/ac: fix intrinsic version check
Reported-by: 375gnu@gmail.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100068

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-06 06:05:58 +10:00
Bas Nieuwenhuizen
a247215469 radv: Merge fast clear flushes.
Don't flush multiple times if we clear multiple attachments. Also allows
doing the depth clear in parallel with the fast color clears.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-05 20:40:31 +01:00
Tim Rowley
a01a104216 relnotes: [swr] note addition of gs, increased llvm requirement
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-05 07:33:49 -06:00
Tim Rowley
bb8a4242ff docs: update features.txt for swr geometry shaders
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-05 07:33:49 -06:00
Tim Rowley
c307092557 swr: [rasterizer core] fix primID provoking vertex for GS
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-05 07:33:49 -06:00
Tim Rowley
f1d7284117 swr: implement geometry shaders
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-05 07:33:49 -06:00
Tim Rowley
08a82363ba configure.ac: increase required swr llvm to 3.9.0
GS implementation uses the masked.{gather,store} intrinsics,
introduced in llvm-3.9.0.  swr llvm version requirement in
automake and scons now match (scons already needed >= 3.9).

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-03-05 07:33:49 -06:00
Kenneth Graunke
6f71d9adc1 i965: Clamp texture buffer size to GL_MAX_TEXTURE_BUFFER_SIZE.
The OpenGL 4.5 specification's description of TexBuffer says:

"The number of texels in the texture image is then clamped to an
 implementation-dependent limit, the value of MAX_TEXTURE_BUFFER_SIZE."

We set GL_MAX_TEXTURE_BUFFER_SIZE to 2^27.  For buffers with a byte
element size, this is the maximum possible size we can encode in
SURFACE_STATE.  If you bind a buffer object larger than this as a
texture buffer object, we'll exceed that limit and hit an isl assert:

   assert(num_elements <= (1ull << 27));

To fix this, clamp the size in bytes to MaxTextureSize / texel_size.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-03-04 22:46:50 -08:00
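Note on 6f71d9adc1 above: the clamp can be pictured as below. This is a sketch of the intent only, expressed as a limit in texels multiplied by the texel size. The names are hypothetical and the exact expression used in the i965 driver may differ.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_TEXTURE_BUFFER_SIZE (1u << 27) /* limit in texels */

/* Clamp a buffer's byte size so that the resulting texel count never
 * exceeds the limit; texel_size is the size of one texel in bytes. */
static uint64_t
sketch_clamped_size_bytes(uint64_t size_bytes, uint32_t texel_size)
{
   assert(texel_size > 0);
   uint64_t max_bytes = (uint64_t)SKETCH_MAX_TEXTURE_BUFFER_SIZE * texel_size;
   return size_bytes < max_bytes ? size_bytes : max_bytes;
}

/* Number of texels visible through the texture buffer after clamping. */
static uint64_t
sketch_num_texels(uint64_t size_bytes, uint32_t texel_size)
{
   return sketch_clamped_size_bytes(size_bytes, texel_size) / texel_size;
}
```

With the clamp in place, a 256 MiB buffer of one-byte texels reports 2^27 elements, so an assertion of the form num_elements <= (1ull << 27) holds.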
Emil Velikov
eaf4a106bd automake: move wayland-drm prior to Vulkan
Earlier commit was picked from a larger series, but did not consider
that it removed the vulkan <> wayland-drm interdependency.

Rather than reverting everything, temporarily move wayland-drm further
up to resolve the issue. Since it [wayland-drm] does not have any
in-mesa dependencies that's perfectly safe.

Cc: Vedran Miletić <vedran@miletic.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100060
Fixes: e135ce6f08 ("vulkan: Build common Vulkan code earlier")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Javier Jardón <jjardon@gnome.org>
2017-03-04 23:44:14 +00:00
Mauro Rossi
6facb0c08f android: fix libz dynamic library dependencies
Fixes a series of libz related building errors:

target SharedLib: gallium_dri_32
(out/target/prod...SHARED_LIBRARIES/gallium_dri_intermediates/LINKED/gallium_dri.so)
external/elfutils/libelf/elf_compress.c:117: error: undefined reference to 'deflateInit_'
...
external/elfutils/libelf/elf_compress.c:244: error: undefined reference to 'inflateEnd'
clang++: error: linker command failed with exit code 1 (use -v to see
invocation)

Fixes: 85a9b1b "util/disk_cache: compress individual cache entries"
2017-03-04 21:47:26 +00:00
Timothy Arceri
28fd6556c3 svga: pass NULL to ureg_get_tokens()
The number of tokens is never used and the pointer is NULL checked
so just pass NULL.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-03-05 08:15:51 +11:00
Ilia Mirkin
8e6d67685e nvc0: take extra pushbuf space into account for pushbuf_space calls
See detailed explanation of why this is needed in commit eb60a89bc3.
This spot was missed/overlooked. Basically as a result of the fact
that BEGIN_* ends up calling PUSH_SPACE, which in turn adds an extra 8
to the requested amount, we have to be mindful of that when doing bare
nouveau_pushbuf_space calls.

Reportedly this fixes some crashes when replaying a hitman trace taken
on radeonsi.

Fixes: eb60a89bc3 ("nouveau: take extra push space into account for pushbuf_space calls")
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reported-by: Karol Herbst <nouveau@karolherbst.de>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-03-04 17:48:27 +01:00
Ilia Mirkin
32dd8d59b6 nvc0: increase alignment to 256 for texture buffers on fermi
When binding as textures, the alignment can be 16. However when binding
as an image, the address has to be aligned to 256. (Also when binding as
an RT, but that can't happen with GL or current gallium APIs.)

Reported-by: Roy Spliet <nouveau@spliet.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-03-04 17:48:27 +01:00
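Note on 32dd8d59b6 above: since the same buffer range may end up bound either as a texture or as an image, the advertised alignment has to satisfy the stricter of the two requirements. A minimal sketch, with hypothetical helper names and the 16/256 values taken from the commit message:

```c
#include <assert.h>
#include <stdint.h>

/* Round offset up to the next multiple of a power-of-two alignment. */
static uint32_t
sketch_align_up(uint32_t offset, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (offset + alignment - 1) & ~(alignment - 1);
}

/* Check whether an offset is acceptable for the given kind of binding:
 * 16-byte alignment for textures, 256-byte alignment for images. */
static int
sketch_offset_ok(uint32_t offset, int as_image)
{
   uint32_t required = as_image ? 256 : 16;
   return (offset & (required - 1)) == 0;
}
```

An offset of 16 is fine for a texture binding but not for an image binding, which is why 256 is the safe value to report.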
Tapani Pälli
66b62be4bb android: fix outdir for gen_enum_to_str files
When files are being generated, the value of the $intermediates variable can
be completely random; this makes sure that outdir is the wanted one.

Fixes: 3f2cb699 ("android: vulkan: add support for libmesa_vulkan_util")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-04 16:38:33 +00:00
Xiaosong Wei
2acc69da8c EGL/Android: Add EGL_EXT_buffer_age extension
This patch implements the EGL_EXT_buffer_age extension for Android.
https://www.khronos.org/registry/EGL/extensions/EXT/EGL_EXT_buffer_age.txt

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-04 16:37:12 +00:00
Emil Velikov
2b1e22f9d8 docs: add news item and link release notes for 17.0.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-04 15:56:58 +00:00
Emil Velikov
1b19304f3f docs: add sha256 checksums for 17.0.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 5c9273152c)
2017-03-04 15:55:10 +00:00
Emil Velikov
6a4f6a49d4 docs: add release notes for 17.0.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 8fee1d348c)
2017-03-04 15:55:09 +00:00
Emil Velikov
1cd4fde053 gallium/targets: don't leave an empty target directory(ies)
Some drivers do not support certain targets - for example nouveau
doesn't do VAAPI, while freedreno doesn't do any of the video backends.

As such, if we enter vdpau when building freedreno/ilo/etc, a vdpau/
folder will be created and an empty library will be built and almost
immediately removed, leaving an empty vdpau/ folder around.

There are two ways to fix this.

 * add substantial tracking in configure/makefiles so that we never end
up in targets/vdpau
 Downsides:
Error prone, as the configure checks and the 'include
gallium/drivers/foo/Automake.inc' can easily get out of sync.

 * remove the folder, if empty, alongside the empty library.
 Downsides:
In the latter case vdpau/ might be empty before the mesa build has
started, yet we'll remove it either way.

This patch implements the latter option, as the downside isn't that
significant, plus the patch is way shorter ;-)

v2: use has_drivers to track since TARGET_DRIVERS can contain space,
hence neither string comparison nor -n/-z works correctly.

Gentoo Bugzilla: https://bugs.gentoo.org/545230
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-04 15:26:43 +00:00
Emil Velikov
342e5fdb64 radv: use enum_to_str util functions.
Port of e9dcb17962
vulkan/util: Add generator for enum_to_str functions

Cc: Bas Nieuwenhuizen <basni@google.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-04 15:05:14 +00:00
Jason Ekstrand
e135ce6f08 vulkan: Build common Vulkan code earlier
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-04 14:46:53 +00:00
Jason Ekstrand
b3135c3cf3 anv: Advertise shaderInt64 on Broadwell and above
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-03 13:59:29 -08:00
Jason Ekstrand
bc456749bd nir/int64: Properly handle imod/irem
The previous implementation was fine for GLSL which doesn't really have
a signed modulus/remainder.  They just leave the behavior undefined
whenever either source is negative.  However, in SPIR-V, there is a
defined behavior for negative arguments.  This commit beefs up the pass
so that it handles both correctly.  Tested using a hacked up version of
the Vulkan CTS test to get 64-bit support.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-03 13:59:27 -08:00
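Note on bc456749bd above: the difference between the two operations for negative operands can be sketched in plain C. This illustrates the semantics only (remainder takes the sign of the dividend, modulus takes the sign of the divisor) and is not the NIR lowering pass. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Remainder: the result takes the sign of the dividend, which is what
 * C's % operator does. INT64_MIN % -1 overflows in C and is not handled. */
static int64_t
sketch_irem(int64_t a, int64_t b)
{
   assert(b != 0);
   return a % b;
}

/* Modulus: the result takes the sign of the divisor. */
static int64_t
sketch_imod(int64_t a, int64_t b)
{
   int64_t r = sketch_irem(a, b);
   /* If the remainder is nonzero and its sign differs from the
    * divisor's, shift it by one divisor. */
   if (r != 0 && ((r < 0) != (b < 0)))
      r += b;
   return r;
}
```

The two agree whenever both operands have the same sign, which is why an implementation that ignores the distinction can pass tests that only use non-negative inputs.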
Jason Ekstrand
9745bef308 nir/builder: Add an int64 immediate helper
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-03 13:59:24 -08:00
Kenneth Graunke
46cd549c2b genxml: Fill out Gen4 and G45 XML.
This is a work in progress - some things may still need fixing.
But it should be in pretty decent shape.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-03 10:23:17 -08:00
Marek Olšák
7f1446a8a1 ac: normalize build helper names
s/emit/build/

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 17:30:07 +01:00
Marek Olšák
8bde7fb3fc ac: replace SI.vs.load.input with amdgcn.buffer.load.format
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 17:30:07 +01:00
Marek Olšák
94811dc66c radeonsi: move SI.vs.load.input building into amd/common
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 17:30:07 +01:00
Marek Olšák
52660484c1 radeonsi: detect and mark loads/stores from read-only/write-only memory
2017-03-03 17:29:56 +01:00
Marek Olšák
97e21cfa25 ac: replace llvm.SI.tbuffer.store with llvm.amdgcn.buffer.store if ADD_TID=0
ADD_TID doesn't work. Needs more investigation.

v2: remove leftover dead code

Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
2017-03-03 15:29:30 +01:00
Marek Olšák
684339827c radeonsi: use the writeonly LLVM attribute
2017-03-03 15:29:30 +01:00
Marek Olšák
8cfdbba6c7 ac: remove offen parameter from ac_build_buffer_store_dword
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
1bc88c02c0 radeonsi: enable TC L2 for tessellation offchip stores
Vulkan does the same thing.
2017-03-03 15:29:30 +01:00
Marek Olšák
27439dfdae radeonsi: merge and simplify tbuffer_store functions
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
b46e412c2e radeonsi: set noalias on input shader pointers
2017-03-03 15:29:30 +01:00
Marek Olšák
d4324ddb89 radeonsi: replace AMDGPU.bfe.* with amdgcn.*bfe
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
9c09592086 radeonsi: move kill intrinsic building into amd/common
just a cleanup

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
e729dc7c46 radeonsi: set readnone on reads from read-only memory
2017-03-03 15:29:30 +01:00
Marek Olšák
25c7969a5a radeonsi: replace SI.buffer.load.dword with amdgcn.buffer.load
2017-03-03 15:29:30 +01:00
Marek Olšák
653ac0b389 radeonsi: replace SI.packf16 with amdgcn.cvt.pkrtz
2017-03-03 15:29:30 +01:00
Marek Olšák
4b2e5b9389 ac: replace old image intrinsics with new ones
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
c6a3911e5d radeonsi: remove last use of llvm.SI.resinfo
and move one function up to reuse the code.
2017-03-03 15:29:30 +01:00
Marek Olšák
ad18d7f040 radeonsi: move image intrinsic building to amd/common
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
2b3ebe307c ac: replace SI.export with amdgcn.exp.*
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
369f4a8726 radeonsi: move llvm.SI.export building to amd/common
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
9af03318aa ac: unify build_type_name_for_intr functions
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
f8c823b103 radeonsi: set unorm=1 for TGSI_TEXTURE_SHADOWRECT as well
It was harmless, because we also set unorm in the sampler state.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
b5744310d4 gallivm, ac: add writeonly and inaccessiblememonly attributes
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Marek Olšák
455c79b24f tgsi/scan: record load/store/atomic image usage
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-03 15:29:30 +01:00
Eric Anholt
3958c01762 glapi: Fix a comment typo
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-03 20:29:12 +11:00
Alejandro Piñeiro
a54f0ad6d3 mesa/main: *TextureSubImage* generates INVALID_OPERATION on wrong target
Equivalent *TexSubImage* methods generate INVALID_ENUM.

From OpenGL 4.5 spec, section 8.6 Alternate Texture Image
Specification Commands:

   "An INVALID_ENUM error is generated by *TexSubImage* if target does
    not match the command, as shown in table 8.15."

And:

   "An INVALID_OPERATION error is generated by *TextureSubImage* if
    the effective target of texture does not match the command, as
    shown in table 8.15."

Fixes:
GL45-CTS.direct_state_access.textures_copy_errors

v2: slightly change commit summary (Samuel)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-03-03 08:14:53 +01:00
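The error split quoted in a54f0ad6d3 can be sketched as a small target check. This is a simplified illustration under assumptions: only two targets are modelled, and `check_subimage_target` with its `dims` and `dsa` parameters is a hypothetical helper, not the Mesa function.

```c
#include <assert.h>
#include <stdbool.h>

#define GL_NO_ERROR          0
#define GL_INVALID_ENUM      0x0500
#define GL_INVALID_OPERATION 0x0502
#define GL_TEXTURE_2D        0x0DE1
#define GL_TEXTURE_3D        0x806F

/* 'dims' is the dimensionality implied by the entry point (2 or 3 here);
 * 'dsa' is true for the *TextureSubImage* (direct state access) variants. */
static unsigned check_subimage_target(unsigned target, unsigned dims, bool dsa)
{
   bool ok = (dims == 2 && target == GL_TEXTURE_2D) ||
             (dims == 3 && target == GL_TEXTURE_3D);
   if (ok)
      return GL_NO_ERROR;
   /* Same mismatch, different error depending on the entry point family. */
   return dsa ? GL_INVALID_OPERATION : GL_INVALID_ENUM;
}
```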
Ben Widawsky
d844d8e4d5 i965: Add Kaby Lake brandstrings
While here, use the spacing defined in Ark.
https://ark.intel.com/products/codename/82879/Kaby-Lake

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
2017-03-02 21:00:02 -08:00
Grazvydas Ignotas
4dc42ae792 tgsi/ureg: return correct token count in ureg_get_tokens
Valgrind reports that the shader cache writes uninitialized data to disk.
Turns out ureg_get_tokens() is returning the count of allocated tokens
instead of how many are actually used, so the cache writes out unused
space at the end. Use the real count instead.

This change should not cause regressions elsewhere because the only
ureg_get_tokens() user that cares about token count is the shader cache.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-03 12:11:55 +11:00
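The allocated-versus-used mismatch fixed in 4dc42ae792 can be illustrated with a toy growable buffer. The structure and function names below are hypothetical and do not mirror the real ureg data structures; the point is only that the getter should report `count`, not `size`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable token buffer. */
struct token_buf {
   unsigned *tokens;
   unsigned size;    /* allocated slots */
   unsigned count;   /* slots actually written */
};

static void token_buf_push(struct token_buf *b, unsigned tok)
{
   if (b->count == b->size) {
      /* Grow geometrically, so size usually exceeds count. */
      unsigned new_size = b->size ? b->size * 2 : 4;
      unsigned *p = realloc(b->tokens, new_size * sizeof(unsigned));
      assert(p != NULL);
      b->tokens = p;
      b->size = new_size;
   }
   b->tokens[b->count++] = tok;
}

/* Report the number of used tokens. Returning b->size instead would make a
 * caller that serializes the buffer write out uninitialized tail slots. */
static const unsigned *token_buf_get(const struct token_buf *b,
                                     unsigned *nr_tokens)
{
   if (nr_tokens)
      *nr_tokens = b->count;
   return b->tokens;
}
```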
Timothy Arceri
6084855528 radeonsi: add support for an on-disk shader cache
V2:
- when loading from disk cache also binary insert into memory cache.
- check that the binary loaded from disk is the correct size. If not
  delete the cache item and skip loading from cache.

V3:
- remove unrequired variable

Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-03 12:09:08 +11:00
Timothy Arceri
85a9b1b562 util/disk_cache: compress individual cache entries
This reduces the cache size for Deus Ex from ~160M to ~30M for
radeonsi (these numbers differ from Grigori's results below
probably due to different graphics quality settings).

I'm also seeing the following improvements in minimum fps in the
Shadow of Mordor benchmark on an i5-6400 CPU@2.70GHz, with a HDD:

no-cache:                    ~10fps
with-cache-no-compression:   ~15fps
with-cache-and-compression:  ~20fps

Note: The with-cache results are from the second run after closing
and opening the game to avoid the in-memory cache.

Since we mainly care about decompression I went with
Z_BEST_COMPRESSION as suggested on irc by Steinar H. Gunderson
who has benchmarked decompression speeds.

Grigori Goronzy provided the following stats for Deus Ex: Mankind
Divided start-up times on an Athlon X4 860k with an SSD:

No Cache                                 215 sec

Cold Cache zlib BEST_COMPRESSION         285 sec
Warm Cache zlib BEST_COMPRESSION         33 sec

Cold Cache zlib BEST_SPEED               264 sec
Warm Cache zlib BEST_SPEED               33 sec

Cold Cache no compression                266 sec
Warm Cache no compression                34 sec

The total cache size for that game is 48 MiB with BEST_COMPRESSION,
56 MiB with BEST_SPEED and 170 MiB with no compression.

These numbers suggest that it may be ok to go with Z_BEST_SPEED
but we should gather some actual decompression times before doing
so. Another option might be to do the compression in a separate
thread, which might allow us to use a stronger compression algorithm
such as LZMA.

Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-03-03 12:09:08 +11:00
Timothy Arceri
5afde61752 util/disk_cache: add support for detecting corrupt cache entries
V2: fix pointer increments for writing/reading crc

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
2017-03-03 12:09:08 +11:00
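A checksum-based corruption check like the one 5afde61752 adds can be sketched as follows. The entry layout (checksum first, payload after) and the helper names are assumptions made for illustration; the commit message only says that a crc is written and read back. The CRC routine is a plain bitwise CRC-32 with the reflected polynomial 0xEDB88320.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320). */
static uint32_t crc32_simple(const uint8_t *data, size_t len)
{
   uint32_t crc = 0xFFFFFFFFu;
   for (size_t i = 0; i < len; i++) {
      crc ^= data[i];
      for (int bit = 0; bit < 8; bit++)
         crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
   }
   return ~crc;
}

/* Hypothetical entry layout: [crc32 of payload][payload].
 * 'out' must have room for len + 4 bytes. Returns the bytes written. */
static size_t entry_write(uint8_t *out, const uint8_t *payload, size_t len)
{
   uint32_t crc = crc32_simple(payload, len);
   memcpy(out, &crc, sizeof(crc));
   memcpy(out + sizeof(crc), payload, len);
   return sizeof(crc) + len;
}

/* Returns 1 if the stored checksum matches the payload, 0 otherwise. */
static int entry_is_valid(const uint8_t *entry, size_t entry_len)
{
   uint32_t stored;
   if (entry_len < sizeof(stored))
      return 0;
   memcpy(&stored, entry, sizeof(stored));
   return stored == crc32_simple(entry + sizeof(stored),
                                 entry_len - sizeof(stored));
}
```

A reader that finds a mismatch can treat the entry as a cache miss instead of handing a damaged binary to the driver.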
Samuel Pitoiset
9fc86d4f53 glsl: fix subroutine mismatch between declarations/definitions
Previously, when q.subroutine was set to 1, a new subroutine
declaration was added to the AST, while 0 meant a subroutine
definition has been detected by the parser.

Thus, setting the q.subroutine flag in both situations is
obviously wrong because a new type identifier is added instead
of trying to match the declaration. To fix it up, introduce
ast_type_qualifier::is_subroutine_decl() to differentiate
declarations and definitions easily.

This fixes a regression with:
arb_shader_subroutine/compiler/direct-call.vert

Cc: Mark Janes <mark.a.janes@intel.com>
Fixes: be8aa76afd ("glsl: remove unecessary flags.q.subroutine_def")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100026
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-03 00:57:57 +01:00
Matt Turner
10f2c86aa3 genxml: Depend on Makefile.am for generated sources.
Depending on the generated Makefile means that all generated sources are
recreated after ./configure.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-02 15:49:00 -08:00
Matt Turner
7d1195c1e4 clover: Work around build failure with AltiVec.
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=587210
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68504
Acked-by: Francisco Jerez <currojerez@riseup.net>
2017-03-02 15:49:00 -08:00
Nanley Chery
d7d64f1091 anv/image: Allow HiZ on input attachment-capable depth/stencil images
While an input attachment may only take on one of those two layouts,
other depth/stencil attachments that use the same image may have
HiZ-enabled layouts. Improves the average frame rate on a release
candidate of a proprietary Vulkan benchmark by 9.94% over 3 runs on my
SKL GT4.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:55 -08:00
Nanley Chery
76b8cc2a1c anv/cmd_buffer: Centralize automatic layout transitions
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:55 -08:00
Nanley Chery
0a72b5f3cb anv/cmd_buffer: Add attachment transitioning functions
This is needed to transition input attachments.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:55 -08:00
Nanley Chery
9950774f8b anv/blorp: Encapsulate subpass id querying
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:55 -08:00
Nanley Chery
c78a959bcf anv/cmd_buffer: Enable render pass awareness
v2: Update cmd_state_reset (Jason Ekstrand)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:55 -08:00
Nanley Chery
c0223d052b anv/pass: Store subpass attachment reference list
We'll loop through this array when performing automatic layout
transitions.

v2: Adjust formatting of an assignment (Jason Ekstrand)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:55 -08:00
Nanley Chery
8f6a17c8e7 anv/pass: Fix size of anv_render_pass:subpass_attachments
Don't allocate space for resolve attachments if the subpass has none.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:55 -08:00
Nanley Chery
608d17b80e anv: Store the user's VkAttachmentReference
We will be using the image layout. Store the full struct directly from
the user.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:55 -08:00
Nanley Chery
6326f0f4be anv/cmd_buffer: Remove extra resolve for certain depth buffers
Due to recent commits, the sampler now bypasses the auxiliary HiZ buffer
when reading from a depth image subresource that is in the general
layout. Remove this unneeded resolve.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:55 -08:00
Nanley Chery
ea744912b3 anv/cmd_buffer: Conditionally choose the sampled image surface state
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:55 -08:00
Nanley Chery
5408d3fd05 anv/descriptor_set: Store aux usage of sampled image descriptors
v2: Rebase onto latest changes
v3: Account for NULL image_view in aux_usage assignment

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:55 -08:00
Nanley Chery
efc2222323 anv/image: Create an additional surface state for sampling
This will be used to sample a depth input attachment without having to
pass through the HiZ buffer.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:54 -08:00
Nanley Chery
f3621f4e71 anv/image: Simplify setup of HiZ sampler surface state
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:54 -08:00
Nanley Chery
258af3a856 anv/image: Remove extra dependency on HiZ-specific variable
surf_usage is only useful to image views that may use HiZ buffers.
Storage image views don't use HiZ buffers.

v2: Update commit message and add an assertion.

Fixes: 055ff2ec52 ("anv: Replace anv_image_has_hiz() with ISL_AUX_USAGE_HIZ")
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:54 -08:00
Nanley Chery
54d29ee65f anv: Update the HiZ sampling helper
Validate the inputs, verify that this image has a depth
buffer, use gen_device_info instead of

v2:
- Add parenthesis (Jason Ekstrand)
- Make parameters const
- Use gen_device_info instead of gen
- Pass aspect to missed function in transition_depth_buffer

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:54 -08:00
Nanley Chery
172747a963 anv/cmd_buffer: Replace layout_to_hiz_usage()
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:54 -08:00
Nanley Chery
425e33bcdb anv/image: Add anv_layout_to_aux_usage()
This function supersedes layout_to_hiz_usage().

v2:
- Don't find the optimal buffer for layout transitions (Jason Ekstrand).
- Pass the devinfo instead of the gen (Jason Ekstrand)
- Update the function documentation.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:54 -08:00
Nanley Chery
178f9e5f29 anv/pass: Avoid accessing attachment array out of bounds
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 13:17:54 -08:00
Jonas Pfeil
cd2b55e536 ralloc: Make sure ralloc() allocations match malloc()'s alignment.
The header of ralloc needs to be aligned, because the compiler assumes
that malloc returns will be aligned to 8/16 bytes depending on the
platform, leading to degraded performance or alignment faults with ralloc.

Fixes SIGBUS on Raspberry Pi at high optimization levels.

This patch is not perfect for MSVC, as maybe in the future the alignment
for the most demanding data type might change to more than 8.

v2: Commit message reword/typo fix, and add a bigger explanation in the
    code (by anholt)

Signed-off-by: Jonas Pfeil <pfeiljonas@gmx.de>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2017-03-02 13:01:45 -08:00
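The alignment requirement described in cd2b55e536 can be sketched with a header-prefixed allocator. This is a minimal C11 illustration, assuming `alignas(max_align_t)` is available; the `hdr` structure and the helper names are hypothetical and much simpler than ralloc's real header.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical allocation header. Aligning the first member to max_align_t
 * rounds sizeof(struct hdr) up to a multiple of that alignment, so the
 * payload placed right after the header is as aligned as malloc's result. */
struct hdr {
   alignas(max_align_t) struct hdr *parent;
   size_t size;
};

static void *hdr_alloc(struct hdr *parent, size_t size)
{
   struct hdr *h = malloc(sizeof(*h) + size);
   if (!h)
      return NULL;
   h->parent = parent;
   h->size = size;
   return h + 1;   /* payload starts right after the header */
}

static void hdr_free(void *ptr)
{
   if (ptr)
      free((struct hdr *)ptr - 1);
}
```

Without the forced alignment, a header whose size is not a multiple of the platform's malloc alignment would shift every payload off that boundary, which is the situation the commit reports as SIGBUS on Raspberry Pi.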
Bruce Cherniak
a7b8d50bcb swr: fix crash in swr_update_derived following st/mesa state changes
A recent change to the st/mesa state update logic caused major regressions
in the swr validation code.

swr uses the same validation logic (swr_update_derived) for both draw
and Clear calls.  New st/mesa state update logic results in certain state
objects not being set/bound during Clear.  This was causing null ptr
exceptions.  Creation of static dummy state objects allows setting these
pointers during Clear validation, without interfering with relevant state
validation.

Once fixed, new logic also highlighted an error in dirty bit checking for
fragment shader and clip validation.

(The alternative is to have a simplified validation routine for Clear.
We may do that at some point.)

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-03-02 13:39:56 -06:00
Bruce Cherniak
74aa6fd9a0 docs: update features.txt for GL_ARB_clear_texture with swr
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-03-02 13:39:56 -06:00
Bruce Cherniak
dd649a541d swr: enable clear_texture with util_clear_texture
Passes corresponding piglit tests.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-02 13:39:52 -06:00
Gregory Hainaut
b36050143f doc: GL_ARB_buffer_storage is supported on llvmpipe/swr
At least, the extension is exported (gallium capability
PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT is 1)

Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-02 17:31:04 +00:00
Emil Velikov
b23db2b840 automake: i965: list correct header in Makefile.source
Fixes: 7ac47b1af7 ("i965: Add a header for brw_vec4_vs_visitor")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-02 17:30:33 +00:00
Brian Paul
b95ead850b svga: fix crash regression since e027935a79
During the first update of the hw_clear_state atoms, we may not yet
have a current rasterizer state object.  So, svga->curr.rast may be
NULL and we crash.

Add a few null pointer checks to work around this.  Note that these
are only needed in the state update functions which are called for
'clear' validation.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-03-02 10:11:19 -07:00
Brian Paul
69fb8f3cae svga: s/unsigned/pipe_prim_type/
And add some default switch cases to silence compiler warnings.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2017-03-02 10:11:19 -07:00
Brian Paul
a9ff377d40 svga: whitespace fixes in svga_context.h
Trivial.
2017-03-02 10:11:13 -07:00
Brian Paul
49134c0549 svga: whitespace and formatting fixes in svga_stage.c
Trivial.
2017-03-02 10:11:04 -07:00
Robert Foss
88becf7302 mesa: Avoid read of uninitialized variable
The is_color_attachment variable is later read when handling two
separate error cases, where only one of the cases results in the
variable being initialized.

This can be avoided by giving the variable a safe default value.

Coverity-Id: 1398631
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-02 15:45:19 +00:00
Lionel Landwerlin
af5f13e58c anv: add VK_KHR_descriptor_update_template support
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 10:34:06 +00:00
Lionel Landwerlin
9f60ed98e5 anv: add VK_KHR_push_descriptor support
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 10:34:06 +00:00
Lionel Landwerlin
12dee851a3 anv: descriptor: make descriptor writing take a stream allocator
This allows us to allocate surface states from the command buffer when
pushing descriptor sets rather than allocating them through a
descriptor set pool.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 10:34:06 +00:00
Lionel Landwerlin
194fa58285 anv: descriptors: extract writing of descriptors elements
This will be reused later on.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 10:34:06 +00:00
Lionel Landwerlin
c2d199adec anv: make layout size computation helper available across compilation units
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 10:34:06 +00:00
Lionel Landwerlin
c83e33e6ee anv: move buffer_view declaration
We will need this declaration closer for readability later.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 10:34:06 +00:00
Tomasz Figa
06758c1e8a mesa: Use _mesa_has_OES_geometry_shader() when validating draws
In validate_DrawElements_common() we need to check for OES_geometry_shader
extension to determine if we should fail if transform feedback is
unpaused. However current code reads ctx->Extensions.OES_geometry_shader
directly, which does not take context version into account. This means
that if the context is GLES 3.0, which makes the OES_geometry_shader
inapplicable, we would not validate the draw properly. To fix it, let's
replace the check with a call to _mesa_has_OES_geometry_shader().

Fixes following dEQP tests on i965 with a GLES 3.0 context:

dEQP-GLES3.functional.negative_api.vertex_array#draw_elements
dEQP-GLES3.functional.negative_api.vertex_array#draw_elements_incomplete_primitive
dEQP-GLES3.functional.negative_api.vertex_array#draw_elements_instanced
dEQP-GLES3.functional.negative_api.vertex_array#draw_elements_instanced_incomplete_primitive
dEQP-GLES3.functional.negative_api.vertex_array#draw_range_elements
dEQP-GLES3.functional.negative_api.vertex_array#draw_range_elements_incomplete_primitive

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-03-02 00:37:17 -08:00
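The difference between reading the extension flag directly and using a version-aware helper, as described in 06758c1e8a, can be sketched like this. Everything here is a hypothetical stand-in: `toy_context`, its fields, and the minimum version constant (assumed to be GLES 3.1) are not the Mesa definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the GL context. */
struct toy_context {
   unsigned es_version;            /* e.g. 30 for GLES 3.0 */
   bool ext_OES_geometry_shader;   /* driver is able to expose the extension */
};

/* Assumed minimum GLES version at which the extension applies. */
#define TOY_GS_MIN_ES_VERSION 31

/* Version-aware check: the driver flag alone is not enough. */
static bool toy_has_OES_geometry_shader(const struct toy_context *ctx)
{
   return ctx->ext_OES_geometry_shader &&
          ctx->es_version >= TOY_GS_MIN_ES_VERSION;
}
```

On a GLES 3.0 context the raw flag can still be true, so only the helper gives the answer that draw validation needs.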
Kenneth Graunke
58793e514b i965: Replace BRW_SURFACEFORMAT_* with ISL_FORMAT_*.
One less set of enums.  Dropped the #defines from brw_defines.h and ran:

$ for file in *.cpp *.c *.h; do sed -i \
      -e 's/BRW_SURFACEFORMAT_/ISL_FORMAT_/g' \
      -e 's/ISL_FORMAT_ASTC_[A-Zxs0-9_]*/\U&/g' $file; \
  done

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 00:30:45 -08:00
Chris Wilson
92281b2c7f i965: Only flush the batchbuffer if we need to zero the SO offsets
If we don't have pipelined register access (e.g. Haswell before kernel
v4.2), then we can only implement EXT_transform_feedback by resetting the
SO offsets *between* batches. However, if we do have pipelined access to
the SO registers on gen7, we can simply emit an inline reset of the SO
registers without a full batch flush.

v2 [by Ken]: Simplify after recent kernel feature detection changes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-02 00:30:41 -08:00
Iago Toral Quiroga
7ad692d8e2 anv: do not subtract the base layer to compute depth in 3DSTATE_DEPTH_BUFFER
According to the PRM description of the Depth field:

  "This field specifies the total number of levels for a volume texture
   or the number of array elements allowed to be accessed starting at the
   Minimum Array Element for arrayed surfaces"

However, ISL defines array_len as the length of the range
[base_array_layer, base_array_layer + array_len], so it already represents
a value relative to the base array layer like the hardware expects.

v2: Depth is defined as a U11-1 field, so subtract 1 from
    the actual value (Jason)

This fixes a number of new CTS tests that would crash otherwise:
dEQP-VK.pipeline.render_to_image.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 09:04:03 +01:00
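The Depth computation from 7ad692d8e2 can be sketched as a "value minus one" encoding. The structure below is a hypothetical subset of a view, with field names taken from the commit text; the 2048 upper bound follows from an 11-bit field storing the value minus one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of an image view. */
struct toy_view {
   uint32_t base_array_layer;
   uint32_t array_len;   /* number of layers starting at base_array_layer */
};

/* Encode the Depth field as U11-1. array_len is already relative to
 * base_array_layer, so the base layer is not subtracted again. */
static uint32_t toy_depth_field(const struct toy_view *view)
{
   assert(view->array_len >= 1 && view->array_len <= 2048);
   return view->array_len - 1;
}
```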
Iago Toral Quiroga
64bf78270d isl: document the meaning of the array_len field in isl_view
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-02 09:03:42 +01:00
Jacob Lifshay
3d8feb38e8 vulkan/wsi: Improve the DRI3 error message
This commit improves the message by telling users that they could probably
enable DRI3.  More importantly, it includes a little heuristic to check
to see if we're running on AMD or NVIDIA's proprietary X11 drivers and,
if we are, doesn't emit the warning.  This way, users with both a discrete
card and Intel graphics don't get the warning when they're just running
on the discrete card.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99715
Co-authored-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Rene Lindsay <rjklindsay@hotmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
2017-03-01 19:11:47 -08:00
Jason Ekstrand
424ac809bf i965: Do int64 lowering in NIR
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-01 17:00:20 -08:00
Jason Ekstrand
074f5ba0b5 nir: Add a simple int64 lowering pass
The algorithms used by this pass, especially for division, are heavily
based on the work Ian Romanick did for the similar int64 lowering pass
in the GLSL compiler.

v2: Properly handle vectors

v3: Get rid of log2_denom stuff.  Since we're using bcsel, we do all the
    calculations anyway and this is just extra instructions.

v4:
 - Add back in the log2_denom stuff since it's needed for ensuring that
   the shifts don't overflow.
 - Rework the looping part of the pass to be easier to expand.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-01 17:00:20 -08:00
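The idea behind an int64 lowering pass such as 074f5ba0b5 is to express each 64-bit operation with 32-bit ones. The sketch below shows only the simplest case, addition with a carry between the halves; it is not the pass itself, and the `u32x2` type and function names are hypothetical. Division, which the commit singles out, needs considerably more work.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves. */
struct u32x2 {
   uint32_t lo, hi;
};

static struct u32x2 split64(uint64_t v)
{
   struct u32x2 r = { (uint32_t)v, (uint32_t)(v >> 32) };
   return r;
}

static uint64_t join64(struct u32x2 v)
{
   return (uint64_t)v.lo | ((uint64_t)v.hi << 32);
}

/* 64-bit add using only 32-bit arithmetic. */
static struct u32x2 lowered_iadd64(struct u32x2 a, struct u32x2 b)
{
   struct u32x2 r;
   r.lo = a.lo + b.lo;
   /* Unsigned wraparound means the low sum is smaller than an input
    * exactly when a carry out of bit 31 occurred. */
   uint32_t carry = r.lo < a.lo;
   r.hi = a.hi + b.hi + carry;
   return r;
}
```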
Jason Ekstrand
86e749b1ad spirv: Use nir_builder for control flow
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-01 17:00:20 -08:00
Jason Ekstrand
95972cd4fd nir/lower_indirect: Use nir_builder control-flow helpers
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-01 17:00:20 -08:00
Jason Ekstrand
3ce8eeb5a1 nir/lower_gs_intrinsics: Use nir_builder control-flow helpers
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-01 17:00:20 -08:00
Jason Ekstrand
c75f965ab7 glsl/nir: Use nir_builder's new control-flow helpers
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-01 17:00:20 -08:00
Jason Ekstrand
e27c716ad7 nir/builder: Add support for easily building control-flow
Each of the pop functions (and push_else) takes a control flow parameter as
their second argument.  If NULL, it assumes that the builder is in a block
that's a direct child of the control-flow node you want to pop off the
virtual stack.  This is what 90% of consumers will want.  The SPIR-V pass,
however, is a bit more "creative" about how it walks the CFG and it needs
to be able to pop multiple levels at a time, hence the argument.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-03-01 17:00:20 -08:00
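The push/pop behaviour described in e27c716ad7 can be modelled with a toy stack. This is only an illustration: the commit calls the stack "virtual", whereas this sketch keeps an explicit array, and all `toy_` names are hypothetical. It shows the two cases from the message: passing NULL pops the innermost node, while naming an outer node pops several levels at once.

```c
#include <assert.h>
#include <stddef.h>

enum toy_cf_kind { TOY_CF_IF, TOY_CF_LOOP };

struct toy_cf_node {
   enum toy_cf_kind kind;
};

struct toy_builder {
   struct toy_cf_node *stack[16];
   unsigned depth;
};

static void toy_push(struct toy_builder *b, struct toy_cf_node *node)
{
   assert(b->depth < 16);
   b->stack[b->depth++] = node;
}

/* Leave 'node'. NULL means the innermost node; naming an outer node also
 * pops every level nested inside it. */
static void toy_pop(struct toy_builder *b, struct toy_cf_node *node)
{
   assert(b->depth > 0);
   if (node == NULL) {
      b->depth--;
      return;
   }
   while (b->depth > 0 && b->stack[b->depth - 1] != node)
      b->depth--;
   assert(b->depth > 0);   /* node must have been on the stack */
   b->depth--;
}
```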
Jason Ekstrand
d5b355ce5f i965: Move intel_debug.h to intel/common/gen_debug.h
This is shared between the Vulkan and GL drivers as it's a requirement
of the back-end compiler.  However, it doesn't really belong in the
compiler.  We rename the file to match the prefix of the other stuff in
common and because libdrm defines an intel_debug.h and this avoids a
pile of possible name conflicts.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-03-01 16:14:03 -08:00
Jason Ekstrand
8048c1953c i965: Reduce cross-pollination between the DRI driver and compiler
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-01 16:14:03 -08:00
Jason Ekstrand
a2195e561a i965: Move select_clip_planes to brw_vs.c
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-01 16:14:03 -08:00
Jason Ekstrand
818bfdfa15 i965: Delete brw_do_cubemap_normalize
This hasn't been used for quite some time now but we never bothered to
get rid of it when we dropped GLSL IR support for vec4.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-01 16:14:03 -08:00
Jason Ekstrand
7ac47b1af7 i965: Add a header for brw_vec4_vs_visitor
brw_vs.h is not a compiler file but brw_vec4_visitor is definitely a
compiler thing.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-01 16:14:02 -08:00
Jason Ekstrand
1c318af743 i965: Move a bunch of pre-compile and link stuff to brw_program.h
It's all GL-specific and brw_program.h is not part of i965_compiler.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-01 16:14:02 -08:00
Jason Ekstrand
fbb9171968 i965: Move image uniform setup to brw_nir_uniforms.cpp
It's the only thing that's using it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-01 16:14:02 -08:00
Jason Ekstrand
820ae39725 i965: Move channel_expressions and vector_splitting to brw_program.h
They're GL-specific.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-01 16:14:02 -08:00
Jason Ekstrand
760c8a1d95 i965: Make mark_surface_used a static inline in brw_compiler.h
One of these days, I'd like to see this function go away altogether
but for now, let's at least put it near the struct it updates.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-01 16:14:02 -08:00
Jason Ekstrand
f33d2b5d05 i965: Move BRW_ATTRIB_WA_* defines to brw_compiler.h
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-01 16:14:02 -08:00
Jason Ekstrand
4e274bcf66 i965: Move BRW_MAX_DRAW_BUFFERS to brw_compiler.h
It does sort-of go with MAX_UBO and friends but MAX_DRAW_BUFFERS is an
actual hardware constant based on the number of things we can blend
rather than an arbitrary "number of things allowed in GL" like some of
the other maximums are.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-01 16:14:02 -08:00
Jason Ekstrand
2523241660 i965/inst: Stop using fi_type
It's a mesa define that's trivial to inline.  This removes a dependence
on main/imports.h.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-01 16:14:02 -08:00
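The "trivial to inline" point in 2523241660 refers to reinterpreting a float's bits as an integer. A minimal sketch with a local union follows; the helper name is hypothetical, and it assumes `float` is a 32-bit IEEE 754 type.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret a float's bit pattern as a 32-bit unsigned integer using a
 * local union instead of a shared typedef. */
static uint32_t float_as_uint(float f)
{
   union {
      float f;
      uint32_t u;
   } fu;
   fu.f = f;
   return fu.u;
}
```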
Jason Ekstrand
ffeb738112 i965: Move brw_register_blocks to brw_fs.cpp
Its one and only caller is brw_compile_fs which lives there.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-01 16:14:02 -08:00
Jason Ekstrand
5b87c7e0e3 i965: Move SHADER_TIME_STRIDE to brw_compiler.h
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-01 16:14:02 -08:00
Jason Ekstrand
f85ef11501 i965: Move SOL binding #defines to brw_compiler.h
While we're at it, we also change the GEN6 binding macro to be a start
index that gets added to the binding.  This makes things a bit more
explicit.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-01 16:14:02 -08:00
Jason Ekstrand
81e5bdf072 i965/gs: Move MAX_GS_INPUT_VERTICES to brw_vec4_gs_visitor.h
Its only users are in brw_vec4_gs_visitor and gen6_vec4_gs_visitor.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-01 16:14:01 -08:00
Jason Ekstrand
c6a719b64f i965/gs: Add the gl_prim_to_hw_prim table to vec4_gs_visitor.cpp
It's currently in brw_util.c but that's the only bit of brw_util.c
that's shared between the compiler and the rest of the GL driver.
It's just a fairly obvious table so the duplication isn't bad.  It's
certainly less pain than trying to figure out how to share the code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-01 16:14:01 -08:00
Jason Ekstrand
035616cb8e i965: Don't use MAX_SURFACES in mark_surface_used
Vulkan doesn't respect MAX_SURFACES so this assert isn't valid in that
case.  It should, however, assert that it isn't insanely large.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-01 16:14:01 -08:00
Jason Ekstrand
0d2c9ce1ce i965: Get rid of BRW_PRIM_OFFSET
This is a relic of when we wired up meta to be able to use RECTLIST
primitives.  It's no longer needed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-01 16:14:01 -08:00
Jason Ekstrand
406321caeb i965/vue_map: Stop using GLbitfield types
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-03-01 16:13:58 -08:00
Jason Ekstrand
45d3dbebb2 i965: Move assign_common_binding_table_offsets to brw_program
This isn't used by Vulkan and is specific to the way the GL driver
works.  There's no reason to have it in common compiler code.  Also, it
relies on BRW_MAX_* defines which are defined in brw_context.h

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-03-01 16:13:55 -08:00
Jason Ekstrand
8123402fd1 i965: Move some gen4 WM defines to brw_compiler.h
These go in wm_prog_key so they're part of the compiler interface.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-03-01 16:13:27 -08:00
Jason Ekstrand
34ede38194 i965: Move brw_disassemble_inst to brw_eu.h
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-03-01 16:13:26 -08:00
Jason Ekstrand
f9c9d551ea i965: Move some helpers from brw_context.h to brw_shader.h
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-03-01 16:13:24 -08:00
Jason Ekstrand
b97782c364 i965: Move a couple of #defines from brw_context to brw_compiler
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-03-01 16:13:09 -08:00
Jason Ekstrand
2c58709023 glsl/int64: Fix a typo in imod64
The zy swizzle gives us one component of quotient and one component of
remainder.  What we wanted was zw for the remainder.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-01 15:31:44 -08:00
Jason Ekstrand
e647c4fbd9 util/build-id: Return a pointer rather than copying the data
We're about to use the build-id as the starting point for another SHA1
hash in the Intel Vulkan driver, and returning a pointer is far more
convenient.

Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-03-01 15:31:44 -08:00
Jason Ekstrand
e3d33a23e6 anv: Properly handle destroying NULL devices and instances
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0 13.0" <mesa-dev@lists.freedesktop.org>
2017-03-01 15:31:44 -08:00
Robert Bragg
f3ec9d33c6 mesa: Fix performance query id check
The queryid_valid() function asserts that an ID given by an application
isn't zero since the spec explicitly reserves an ID of zero as invalid.

The implementation was written as if the ID was a signed integer and
based on the assumption that queryid_to_index() is simply subtracting
one from the ID. It was broken because in fact the ID was stored in an
unsigned int and testing for an index >= 0 would always succeed.

This adds a spec quote to clarify why zero is considered invalid and
checks for zero before even passing the ID to queryid_to_index(), and
only then checks the upper bound.
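
The order of checks described above can be sketched as follows. This is a minimal illustration, not the code from the patch: the function signatures and the `num_queries` parameter are assumptions, and `queryid_to_index()` is modelled as a plain subtract-one, as the message describes.

```c
#include <stdbool.h>

/* Hypothetical stand-in: the commit message says the real helper simply
 * subtracts one from the ID. */
static unsigned
queryid_to_index(unsigned queryid)
{
   return queryid - 1;
}

/* Sketch of the fixed check. The ID is unsigned, so a test such as
 * "index >= 0" can never fail; zero has to be rejected explicitly
 * before the conversion, and only then is the upper bound checked. */
static bool
queryid_valid(unsigned num_queries, unsigned queryid)
{
   if (queryid == 0)
      return false;

   return queryid_to_index(queryid) < num_queries;
}
```

With four queries, IDs 1 through 4 are accepted and both 0 and 5 are rejected.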

This is a v2 of a patch originally posted by Juha-Pekka (thanks)

Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2017-03-01 23:01:48 +00:00
Tobias Klausmann
6d600cf632 amd/common: Fix build with new ac_add_function_attr()
Fix usage of ac_add_function_attr() and make it known!

common/ac_nir_to_llvm.c: In function 'create_llvm_function':
common/ac_nir_to_llvm.c:265:4: error: implicit declaration of function
'ac_add_function_attr' [-Werror=implicit-function-declaration]
    ac_add_function_attr(main_function, i + 1, AC_FUNC_ATTR_BYVAL);
    ^~~~~~~~~~~~~~~~~~~~

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-01 23:53:38 +01:00
Daniel Stone
a1727aa75e egl/wayland: Don't use DRM format codes for SHM
The wl_drm interface (akin to X11's DRI2) uses the standard set of DRM
FourCC format codes. wl_shm copies this, except for ARGB8888/XRGB8888,
which use their own definitions.

Make sure we only use wl_shm format codes when we're working with
wl_shm. Otherwise, using swrast with 32bpp formats would fail with an
error.
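
A rough sketch of the kind of translation this implies is shown below. The helper name and the `SKETCH_` constants are hypothetical; the numeric values are assumed to mirror drm_fourcc.h and the wl_shm protocol, where only ARGB8888 and XRGB8888 differ from the FourCC codes.

```c
#include <stdint.h>

/* Same packing as the DRM FourCC macro: first character in the low byte. */
#define SKETCH_FOURCC(a, b, c, d) \
   ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
    ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Assumed values, copied here so the sketch is self-contained. */
#define SKETCH_DRM_FORMAT_ARGB8888 SKETCH_FOURCC('A', 'R', '2', '4')
#define SKETCH_DRM_FORMAT_XRGB8888 SKETCH_FOURCC('X', 'R', '2', '4')
#define SKETCH_DRM_FORMAT_RGB565   SKETCH_FOURCC('R', 'G', '1', '6')
#define SKETCH_WL_SHM_FORMAT_ARGB8888 0u
#define SKETCH_WL_SHM_FORMAT_XRGB8888 1u

/* Hypothetical helper: map a DRM FourCC to the code wl_shm expects.
 * The two 32bpp formats have their own wl_shm definitions; every other
 * format is passed through unchanged. */
static uint32_t
sketch_drm_to_shm_format(uint32_t drm_format)
{
   switch (drm_format) {
   case SKETCH_DRM_FORMAT_ARGB8888:
      return SKETCH_WL_SHM_FORMAT_ARGB8888;
   case SKETCH_DRM_FORMAT_XRGB8888:
      return SKETCH_WL_SHM_FORMAT_XRGB8888;
   default:
      return drm_format;
   }
}
```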

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Daniel Stone <daniels@collabora.com> (v1)
Fixes: cb5e799448 ("egl/wayland: unify dri2_wl_create_surface implementations")

v2: [Emil Velikov: move to dri2_wl_create_window_surface]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com> (IRC)
2017-03-01 18:36:55 +00:00
Kenneth Graunke
c0e9e61c9a mesa: Drop unused STATE_TEXRECT_SCALE program statevars.
The last user is now gone.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
2017-03-01 10:27:38 -08:00
Kenneth Graunke
f356d05393 i965: Drop unused STATE_TEXRECT_SCALE code.
In the past, we used this on Gen4-5 to transform non-normalized texture
coordinates (for sampler2DRect) to normalized ones.  We also used it on
Gen6-7.5 for sampler2DRect with GL_CLAMP.

Jason dropped this code in 6c8ba59cff
in favor of using nir_lower_tex(), which just does a textureSize()
call.  But we were still setting up these state references for
useless uniform data.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
2017-03-01 10:27:36 -08:00
Kenneth Graunke
4061bbccf2 egl: Ensure ResetNotificationStrategy matches for shared contexts.
Fixes:
dEQP-EGL.functional.robustness.negative_context.invalid_robust_shared_context_creation

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
2017-03-01 10:26:42 -08:00
Marek Olšák
940da36a65 gallivm,ac: add function attributes at call sites instead of declarations
They can vary at call sites if the intrinsic is NOT a legacy SI intrinsic.
We need this to force readnone or inaccessiblememonly on some amdgcn
intrinsics.

This is only used with LLVM 4.0 and later. Intrinsics only used with
LLVM <= 3.9 don't need the LEGACY flag.

gallivm and ac code is in the same patch, because splitting would be
more complicated with all the LEGACY uses all over the place.

v2: don't change the prototype of lp_add_function_attr.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com> (v1)
2017-03-01 18:59:36 +01:00
Marek Olšák
408f370710 gallivm,ac: remove unused FUNC_ATTR_LAST enums
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-03-01 18:59:36 +01:00
Nicolai Hähnle
40c77bbf83 st/mesa: inform the driver of framebuffer changes before compute dispatches
Even though compute shaders cannot access the framebuffer, there is a
synchronization issue when a compute dispatch accesses a texture that
was previously bound and drawn to as a framebuffer.

Section 9.3 (Feedback Loops Between Textures and the Framebuffer) of
the OpenGL 4.5 spec rather implicitly clarifies that undefined behavior
results if the texture is still attached to the currently bound
framebuffer. However, the feedback loop is broken when the application
changes the framebuffer binding before a compute dispatch, and the
state tracker needs to let the driver know about this.

Fixes GL45-CTS.compute_shader.pipeline-post-fs on SI family Radeons.

Cc: mesa-stable@lists.freedesktop.org

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-03-01 18:59:36 +01:00
Nicolai Hähnle
911391bd70 st/glsl_to_tgsi: avoid iterating past the head of the instruction list
exec_node::get_prev() does not guard against going past the beginning
of the list, so we need to add explicit checks here.

Found by ASAN in piglit arb_shader_storage_buffer_object-rendering.

Cc: mesa-stable@lists.freedesktop.org

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-03-01 18:59:36 +01:00
Marc Dietrich
64b215223f r600g: fix build without opencl and static llvm libs
radeon_llvm_check and friends were never called in the no-opencl case,
which ended up with an empty llvm module list. As --enable-opencl always
requires --enable-llvm, we can use the latter as the guard.

Signed-off-by: Marc Dietrich <marvin24@gmx.de>
[Emil Velikov: commit message polish]
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-01 13:22:48 +00:00
Samuel Pitoiset
be8aa76afd glsl: remove unnecessary flags.q.subroutine_def
This bit is definitely not necessary because subroutine_list
can be used instead. This frees one more bit in the flags.q
struct which is nice because arb_bindless_texture will need
4 bits for the new layout qualifiers.

No piglit regressions found (including compiler tests) with
"-t subroutine".

v2: set the subroutine flag for validating illegal flags

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-01 14:15:31 +01:00
Emil Velikov
ca7d2025a7 vulkan: provide vk.xml as argument to the python generator
Do not hardcode the file in the python script, but pass it via the build
system(s). The latter is the only one that should know about the file
location/tree structure.

Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-28 18:53:04 +00:00
Emil Velikov
14281c9035 automake: vulkan: rename/reuse VULKAN_UTIL_{GENERATED_,}FILES list
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-28 14:13:09 +00:00
Mauro Rossi
3f2cb699cf android: vulkan: add support for libmesa_vulkan_util
The following changes are implemented:

Add src/vulkan/Android.mk to build libmesa_vulkan_util
Android.mk: add src/vulkan to SUBDIR to build new module
intel/vulkan: fix libmesa_vulkan_util,vk_enum_to_str.h dependencies
Add -o OUTPUT_PATH option in src/vulkan/util/gen_enum_to_str.py script
Use -o OUTPUT_PATH option in automake generation rules for vk_enum_to_str.{c,h}

Fixes: e9dcb17 "vulkan/util: Add generator for enum_to_str functions"
Fixes: 8e03250 "vulkan: Combine wsi and util makefiles"
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

[Emil Velikov]
 - Move parser within main()
 - Use --outdir instead of -o
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-28 01:24:41 +01:00
Emil Velikov
3bbbb63801 automake: r600: radeonsi: correctly manage libamd_common.la linking
Since both r600 and radeonsi use code from libamd_common they need to
static link it. At the same time, adding a common library to LIB_DEPS is
fragile [can lead to multiple symbol definitions] and non-obvious - I
had to do a double-take how things work atm.

So follow the libradeon.la approach and put common libraries in
TARGET_RADEON_COMMON

Fixes: 936f5407a7 ("gallium/radeon: Add libamd_common.a to TARGET_LIB_DEPS also for r600")
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2017-02-28 10:55:46 +00:00
Emil Velikov
8af447d6f0 glx/tests: automake: add dispatch-index-check to the tarball
Otherwise we'll fail at `make distcheck'

Fixes: 3cc33e7640 ("glx: add GLXdispatchIndex sort check")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-28 16:18:27 +00:00
Emil Velikov
3935690d58 automake: anv: add missing include $(top_srcdir)/src/vulkan/util
Otherwise we'll fail to find the header and `make distcheck` will bail.

Fixes: e9dcb17962 ("vulkan/util: Add generator for enum_to_str functions")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-28 14:08:17 +00:00
Samuel Iglesias Gonsálvez
0dddad5b1b i965/fs: emit MOV_INDIRECT with the source with the right register type
This was hiding bugs as it retyped the source to destination's type.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-03-01 06:50:35 +01:00
Samuel Iglesias Gonsálvez
d8122128bc i965/fs: fix source type when emitting MOV_INDIRECT to read ICP handles
When generating the MOV INDIRECT instruction, the source type is ignored
and it is set to destination's type. However, this is going to change in a
later patch, so we need to explicitly set the proper source type.

brw_vec8_grf() creates a float-typed fs_reg by default, whereas the
ICP handle is actually unsigned. This patch fixes these cases before
applying the aforementioned patch.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-03-01 06:50:35 +01:00
Samuel Iglesias Gonsálvez
56266df7ed i965/fs: fix indirect load DF uniforms on BSW/BXT
The lowered BSW/BXT indirect move instructions had incorrect
source types, which luckily wasn't causing incorrect assembly to be
generated due to the bug fixed in the next patch, but would have
confused the remaining back-end IR infrastructure due to the mismatch
between the IR source types and the emitted machine code.

v2:
- Improve commit log (Curro)
- Fix read_size (Curro)
- Fix DF uniform array detection in assign_constant_locations() when
  it is accessed with 32-bit MOV_INDIRECTs in BSW/BXT.

v3:
- Move changes in assign_constant_locations() to other patch.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-03-01 06:50:35 +01:00
Samuel Iglesias Gonsálvez
a497ab6838 i965/fs: detect different bit size accesses to uniforms to push them in proper locations
Previously, if we had accesses with different sizes to the same uniform, we might not
push it aligned with the bigger one. This is a problem on BSW/BXT when we access
an array of DF uniforms with both direct and indirect addressing, because for the latter
we use 32-bit MOV INDIRECT instructions. However, this problem can happen with other
generations and bit sizes.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-03-01 06:50:29 +01:00
Samuel Iglesias Gonsálvez
7427425247 i965/fs: mark last DF uniform array element as 64 bit live one
Because of this bug, we may fail to detect the end of a contiguous area
correctly and push larger areas than the real ones.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-03-01 06:50:10 +01:00
Dave Airlie
e66be3d3bb radv: fix txs for sampler buffers
I messed this up when I wrote it, this fixes:
dEQP-VK.memory.pipeline_barrier.*uniform_texel_buffer.*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-01 08:02:24 +10:00
Marek Olšák
8c838730d0 amd/common: fix ASICREV_IS_POLARIS11_M for Polaris12
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-28 21:44:30 +01:00
Bas Nieuwenhuizen
6e9fb1de7f radv: Don't allocate space for unused immutable samplers.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-28 20:48:18 +01:00
Bas Nieuwenhuizen
137b06b437 radv/ac: Use constants for immutable samplers.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-28 20:48:14 +01:00
Bas Nieuwenhuizen
500e6e40f6 radv: Detect if all immutable samplers for a binding are equal.
We can then use constants for indexed loads.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-28 20:48:10 +01:00
Bas Nieuwenhuizen
dd2a0c7aef radv: Store the immutable samplers as uint32_t[4].
So we don't need to know about radv_sampler in ac_nir_to_llvm.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-28 20:46:02 +01:00
Brendan King
884f65e185 egl/dri3: implement query surface hook
This is a DRI3 version of a change made for DRI2
(4d6d4f939e, "egl/dri2: implement query surface hook"),
that fixed failures in dEQP-EGL.functional.resize.surface_size.grow
and dEQP-EGL.functional.resize.surface_size.shrink.

Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Mark Janes <mark.a.janes@intel.com>
Cc: Chad Versace <chadversary@chromium.org>
Signed-off-by: Brendan King <Brendan.King@imgtec.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-02-28 10:11:42 +00:00
Michel Dänzer
936f5407a7 gallium/radeon: Add libamd_common.a to TARGET_LIB_DEPS also for r600
Fixes build failure with --enable-opencl --enable-xvmc:

make[4]: Entering directory '/home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/targets/xvmc'
  CXXLD    libXvMCgallium.la
../../../../src/gallium/drivers/r600/.libs/libr600.a(evergreen_compute.o): In function `evergreen_create_compute_state':
/home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/drivers/r600/../../../../../src/gallium/drivers/r600/evergreen_compute.c:254: undefined reference to `ac_elf_read'
../../../../src/gallium/drivers/r600/.libs/libr600.a(evergreen_compute.o): In function `r600_shader_binary_read_config':
/home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/drivers/r600/../../../../../src/gallium/drivers/r600/evergreen_compute.c:189: undefined reference to `ac_shader_binary_config_start'
/home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/drivers/r600/../../../../../src/gallium/drivers/r600/evergreen_compute.c:189: undefined reference to `ac_shader_binary_config_start'
collect2: error: ld returned 1 exit status
Makefile:760: recipe for target 'libXvMCgallium.la' failed

Fixes: dc4c551a34 ("radeon/ac: switch from radeon_elf_read() to ac_elf_read()")
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-02-28 16:35:21 +09:00
Kenneth Graunke
b8cd78eaa1 i965: Move intel_resolve_map.[ch] from i965_compiler_FILES to i965_FILES
I have no idea why these were part of the compiler files.  They're
miptree related code, and the compiler doesn't appear to use them.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-27 22:56:59 -08:00
Timothy Arceri
4d0d81379e gallium/r600: fix r600 build when OpenCL is enabled
Fixes build regression caused by d90bf4ef3e
2017-02-28 15:42:18 +11:00
Timothy Arceri
d90bf4ef3e radeon: remove unused radeon_elf_util.{c,h}
We now use the shared code in AMD common instead.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-28 13:20:31 +11:00
Timothy Arceri
503fb134e8 radeon/ac: switch to ac_shader_binary_config_start()
For radeonsi we could probably switch to
ac_shader_binary_read_config(). However the functions have
diverged so just share this helper for now.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-28 13:20:31 +11:00
Timothy Arceri
f0aaa4b3a4 radeon/ac: make ac_shader_binary_config_start() available externally
The read config functions are different for r600 and radeonsi so
we can't just share the one in amd common. So just share this
instead.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-28 13:20:31 +11:00
Timothy Arceri
dc4c551a34 radeon/ac: switch from radeon_elf_read() to ac_elf_read()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-28 13:20:31 +11:00
Timothy Arceri
69a687189e radeon/ac: switch from radeon_shader_binary to ac_shader_binary
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-28 13:20:31 +11:00
Timothy Arceri
affc8314cb radeon/ac: add llvm_ir_string to ac_shader_binary struct
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-28 13:20:31 +11:00
Kenneth Graunke
63d1ebca3a ralloc: Delete autofree handling.
There was exactly one user of this, and I just removed it.

It also accessed an implicit global context, with no locking.  This
meant that it was only safe if all callers of ralloc_autofree_context()
held the same lock...which is a pretty terrible thing for a utility
library to impose.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-27 15:46:12 -08:00
Kenneth Graunke
aa8bb9fc15 compiler: Free types in _mesa_glsl_release_types() rather than autofree.
Instead of using ralloc_autofree_context() to install an atexit()
handler to ralloc_free(glsl_type::mem_ctx), we can simply free them
from _mesa_glsl_release_types().

This is effectively the same, because _mesa_glsl_release_types() is
called from _mesa_destroy_shader_compiler(), which is called from Mesa's
one_time_fini() function, which Mesa installs as an atexit() handler.

The one advantage here is that it ensures the built-in functions are
destroyed before the types.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-27 15:46:12 -08:00
Jan Vesely
010fecb853 clover: Dump linked binary to a different file
This allows passing the generated files directly to llc or bugpoint.

v2: add atomic counter ID
v3: remove extra scope operator, constify

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-02-27 16:11:48 -05:00
Dave Airlie
800b82ea13 radv: fix depth format in blit2d.
For blitting we need to use the depth or stencil format, never
the combined.

This fixes:
dEQP-VK.texture.shadow.2d.nearest.less_or_equal_d32_sfloat_s8_uint
and a few others.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-28 06:11:54 +10:00
Dave Airlie
1121ce4525 radv/formats: add fast clear for 8-bit signed ints.
These formats are used by some CTS tests, may as well fill them in.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-28 06:11:50 +10:00
Samuel Pitoiset
ec623f77eb mesa/main: refactor sampler parameter error codepath
This is similar to what we do in the texture error codepath.
While we are at it, update the specification comment with
latest GL 4.5 spec.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-27 19:42:23 +01:00
Samuel Pitoiset
e69fd0b43c glsl: reject samplers not declared as uniform/function params earlier
This improves consistency with image variables and atomic
counters which are already rejected the same way.

Note that opaque variables can't be treated as l-values, which
means only the 'in' function parameter is allowed.

v2: rewrite commit message

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
2017-02-27 19:42:00 +01:00
Samuel Pitoiset
08a052966f glsl: use is_sampler() anywhere it's possible
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-02-27 19:41:14 +01:00
Samuel Pitoiset
e12f4edf9c glsl: use is_image() anywhere it's possible
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-02-27 19:41:11 +01:00
Samuel Pitoiset
46562a062b glsl: add missing blend_support qualifier in validate_flags()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-02-27 19:40:12 +01:00
Samuel Pitoiset
87ee1729d0 glsl: use an enum for AMD_conservative_depth layout qualifiers
The main idea behind this is to free some bits in the flags.q
struct because currently all 64-bits are used and we can't
add more layout qualifiers without reaching a static assert.

In order to do that (mainly for ARB_bindless_texture), use an
enumeration for the AMD_conservative_depth layout qualifiers
because it's forbidden to declare more than one depth qualifier
for gl_FragDepth.

Note that ast_type_qualifier::merge_qualifier() will prevent
using duplicate layout qualifiers by returning a compile-time
error.

No piglit regressions found (including compiler tests) with
RX480 on RadeonSI.

v2: use a switch case

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com> (v1)
2017-02-27 19:39:37 +01:00
Samuel Pitoiset
de2727925a glsl: add has_shader_image_load_store()
Preliminary work for ARB_bindless_texture which can interact
with ARB_shader_image_load_store.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-27 19:33:10 +01:00
Samuel Pitoiset
ea8086861f drirc: add force_glsl_version=440 for The Culling
This game uses GLSL 430 but the interpolation qualifiers in
some shaders don't match, which ends up in a link error. GLSL
440 spec removed this restriction, force it.

This fixes the following link error, as well as serious
rendering problems.

error: vertex shader output `out_TEXCOORD1' specifies noperspective
interpolation qualifier, but fragment shader input specifies no
interpolation qualifier

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-02-27 19:32:55 +01:00
Jason Ekstrand
76c8327e6e anv: Bump advertised version to 1.0.42
We've been following the spec changes.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-02-27 09:44:46 -08:00
Jason Ekstrand
54dd42eb94 vulkan: Update registry and headers to 1.0.42
This brings in a bunch of new extensions
2017-02-27 09:44:45 -08:00
Elie TOURNIER
082d5b1aee nir: Delete unused arg in get_iteration
nir_const_value is not needed in get_iteration

Signed-off-by: Elie Tournier <tournier.elie@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-27 14:35:16 +00:00
Eric Engestrom
077879cf5e docs: fix a few typos
Noticed a couple, found the rest using vimspell.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-27 14:15:10 +00:00
Grazvydas Ignotas
7f268cf12b gallium/u_queue: set num_threads correctly if not all threads start
If the i-th thread could not be created it means we have i threads,
not i+1, because we start from 0.
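
The counting argument can be sketched as below. This is an illustration only: `sketch_start_workers()` and the `create` callback are hypothetical names, and real thread creation is replaced by a callback so the off-by-one is easy to see.

```c
#include <stdbool.h>

/* Hypothetical stand-in for starting workers. "create" models thread
 * creation and returns false on failure. If creation fails at index i,
 * workers 0..i-1 exist, so the usable count is i rather than i + 1. */
static unsigned
sketch_start_workers(unsigned requested, bool (*create)(unsigned index))
{
   unsigned i;

   for (i = 0; i < requested; i++) {
      if (!create(i))
         break;
   }

   return i; /* number of workers that actually started */
}

/* Example callback for the sketch: the third creation (index 2) fails. */
static bool
sketch_create_fails_at_two(unsigned index)
{
   return index < 2;
}
```

Requesting four workers with this callback reports two started workers, matching the two that were created before the failure.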

Fixes: 404d0d5 "gallium/u_queue: add an option to have multiple worker threads"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-02-27 14:49:46 +01:00
Grazvydas Ignotas
9936121935 gallium/u_queue: fix a crash with atexit handlers
Commit 4aea8fe ("gallium/u_queue: fix random crashes when the app calls
exit()") added an atexit handler which calls
util_queue_killall_and_wait() for each queue to stop the threads.
However, the app is also free to use atexit handlers to clean up things,
leading to a util_queue_destroy() call which will also call
util_queue_killall_and_wait() for the same queue again, causing threads
to be joined twice, and that is undefined. This happens with libglut,
for example. A simple fix is to just set num_threads to 0 as there are
no more valid threads after util_queue_killall_and_wait() returns.
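
A minimal sketch of why zeroing the count makes the second call harmless follows. The struct and function names are hypothetical, and a counter stands in for the actual thread joins.

```c
/* Hypothetical model of a queue: "joins" counts how many times a
 * worker would have been joined. */
struct sketch_queue {
   unsigned num_threads;
   unsigned joins;
};

/* Stand-in for the kill-and-wait step. After the loop no valid threads
 * remain, so the count is reset to 0; a second call, for example from
 * an application's own atexit handler, then iterates zero times instead
 * of joining the same threads again. */
static void
sketch_killall_and_wait(struct sketch_queue *queue)
{
   unsigned i;

   for (i = 0; i < queue->num_threads; i++)
      queue->joins++; /* models joining worker i */

   queue->num_threads = 0;
}
```

Calling the function twice on a queue with three workers performs three joins in total, not six.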

Fixes: 4aea8fe "gallium/u_queue: fix random crashes when the app calls exit()"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-02-27 14:49:15 +01:00
Bas Nieuwenhuizen
43d833ae97 radv: Use correct size for availability flag.
Per spec, VK_QUERY_RESULT_64_BIT specifies the integer size and the
availability flag is an integer. We apparently handled this correctly
already for the copy to buffer case.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
2017-02-27 01:33:10 +01:00
Bas Nieuwenhuizen
8ea34a98c0 radv: Only use PKT3_OCCLUSION_QUERY when it doesn't hang.
PKT3_OCCLUSION_QUERY hangs when used in a nested IB. This only
calls it when in a primary command buffer and we change
GetQueryPoolResults to not need it. CmdCopyQueryPoolResults
still needs it so we break that behavior for secondary command buffers.
However, that would hang already and using an uninitialized value is
better than a hang.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
2017-02-27 01:33:10 +01:00
Bas Nieuwenhuizen
bb878db7eb radv: Reset emitted compute pipeline when calling secondary cmd buffer.
Otherwise if the new compute pipeline is the same as the last used
pipeline before the call, we don't emit it again.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
2017-02-27 01:33:10 +01:00
Dave Airlie
15f47027ad radv: add support for NV_dedicated_allocation
This adds initial support for NV_dedicated_allocation, then
uses it for the wsi image/memory allocation paths internally
in the driver.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-27 00:22:51 +00:00
Andres Rodriguez
35189d3279 radv/winsys: fix freeing imported memory.
This bo->fd wasn't setting some stuff correctly that could
lead to crashes for anything using this path later.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-27 00:22:39 +00:00
Dave Airlie
f695735ed6 vulkan/wsi/radv: add initial prime support (v1.1)
This is a complete rewrite of my previous rfc patches.

This adds the ability to present to a different GPU than the one rendering,
using a driver side operation that can copy from the tiled to
linear shared image.

This does prime support completely in the swapchain present code,
and each queue has a precreated command buffer for each image
and for each queue family. This means presenting should work
on graphics and compute queues and transfer in the future.

v1.1: initialise needs_linear_copy in swapchain.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-27 05:42:16 +10:00
Bas Nieuwenhuizen
336b05c49a radv/ac: Add integer->integer casts.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-02-26 19:59:27 +01:00
Eric Engestrom
5b5ffb795f check: add support for running test as standalone
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
2017-02-26 13:39:45 +00:00
Eric Engestrom
cd35a119ad check: make any failure fatal
Previously, only the last error code was returned.
Using `set -e` makes the script quit on any unhandled error.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
2017-02-26 13:39:43 +00:00
Eric Engestrom
a1e5e55989 check: mark two tests as requiring bash
Requirement was removed just before pushing, but it's actually needed
for here-strings (`<<<`).

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
2017-02-26 13:39:12 +00:00
Mike Lothian
47c49f6190 st/nine: Drop USER_INDEX_BUFFERS check
This fixes 4a883966c1 where the
PIPE_CAP was removed.

Now that USER_INDEX_BUFFERS are always enabled, remove the check and only
check for cmst_active directly.

v2: Axel pointed out the code was still needed when cmst was inactive,
    Rebase on master too
v3: Drop struct member user_ibufs also && fixup shortlog (Edward).
v4: Fix negation
v5: Use the right variable name csmt != cmst

Fixes: 4a883966c1 ("gallium: remove PIPE_CAP_USER_INDEX_BUFFERS")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99953
Reported-and-tested-by: Vinson Lee <vlee@freedesktop.org> (v1)
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-02-25 23:20:18 +11:00
Constantine Charlamov
abb1c645c4 st/nine: make use of common uploaders v4
Make use of common uploaders that landed recently to Mesa

v2: fixed formatting, broken due to thunderbird configuration

v3: per Axel comment: added a comment into NineDevice9_DrawPrimitiveUP

v4: per Axel comment: changed style of the comment
2017-02-25 09:31:10 +01:00
Timothy Arceri
6b4bb24acf compiler: style clean-ups in blob.h
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2017-02-25 13:30:28 +11:00
Brian Paul
fcf466383a svga: fix MSVC build error after PIPE_CAP_USER_INDEX_BUFFERS removal
Need to specify the zero for the struct initializer.  My earlier test
of the patch series was with MinGW, not MSVC.

Trivial.
2017-02-24 19:07:10 -07:00
Eric Anholt
292c24ddac vc4: Lazily emit our FS/VS input loads.
This reduces register pressure in both types of shaders, by reordering the
input loads from the var->data.driver_location order to whatever order
they appear first in the NIR shader.  These instructions aren't
reorderable at our QIR scheduling level because the FS takes two in
lockstep to do an interpolation, and the VS takes multiple read
instructions in a row to get a whole vec4-level attribute read.

shader-db impact:
total instructions in shared programs: 76666 -> 76590 (-0.10%)
instructions in affected programs:     42945 -> 42869 (-0.18%)
total max temps in shared programs: 9395 -> 9208 (-1.99%)
max temps in affected programs:     2951 -> 2764 (-6.34%)

Some programs get their max temps hurt, depending on the order that the
load_input intrinsics appear, because we end up being unable to copy
propagate an older VPM read into its only use.
2017-02-24 17:01:29 -08:00
Eric Anholt
f06915d7b7 vc4: Refactor the load_input code out of the intrinsic code.
It's going to gain most of ntq_setup_inputs(), so simplify it first.
2017-02-24 16:31:54 -08:00
Eric Anholt
84a304eb96 vc4: Track the last block we emitted at the top level.
This will be used for delaying our VPM reads (which must be unconditional)
until just before they're used.
2017-02-24 16:31:54 -08:00
Eric Anholt
99d4203ad5 vc4: Emit max number of temps in the shader-db output.
We need to be paying attention to optimization's impact on this -- even if
we reduce instruction count, increasing max temps in general is likely to
cause us to fail to register allocate on some shaders, which means that
those won't run at all.
2017-02-24 16:31:54 -08:00
Vinson Lee
30a4b25efe util/disk_cache: Use backward compatible st_mtime.
Fix Mac OS X build error.

  CC       libmesautil_la-disk_cache.lo
In file included from disk_cache.c:46:
./disk_cache.h:57:20: error: no member named 'st_mtim' in 'struct stat'
   *timestamp = st.st_mtim.tv_sec;
                ~~ ^

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99918
Fixes: 207e3a6e4b ("util/radv: move *_get_function_timestamp() to utils")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-24 16:06:40 -08:00
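A minimal sketch of the portable access that 30a4b25efe describes, assuming a POSIX `stat()`. The helper name `get_file_mtime` is hypothetical and is not the function in disk_cache.h; it only shows reading `st_mtime` instead of `st_mtim.tv_sec`, which the build error above reports as missing on Mac OS X.

```c
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: stores the modification time of `path` (whole
 * seconds) in *timestamp. Returns 1 on success, 0 if stat() fails. */
int get_file_mtime(const char *path, time_t *timestamp)
{
   struct stat st;

   if (stat(path, &st) != 0)
      return 0;

   /* st_mtime is the backward compatible field; st_mtim is not
    * available in every struct stat. */
   *timestamp = st.st_mtime;
   return 1;
}
```

Only second resolution is kept here, which matches the `tv_sec` value the failing line was reading.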
Vinson Lee
c3f9540a0c glsl: Fix missing-braces warning.
CXX    glsl/ast_to_hir.lo
glsl/ast_to_hir.cpp: In member function 'virtual ir_rvalue* ast_declarator_list::hir(exec_list*, _mesa_glsl_parse_state*)':
glsl/ast_to_hir.cpp:4846:42: warning: missing braces around initializer for 'unsigned int [16]' [-Wmissing-braces]

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-02-24 16:04:06 -08:00
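For readers unfamiliar with the -Wmissing-braces warning quoted in c3f9540a0c, the sketch below shows the general pattern in C. The struct and function names are hypothetical and are not taken from ast_to_hir.cpp; the assumption is only that an aggregate whose first member is an array of 16 unsigned ints was initialised with a single level of braces.

```c
/* Hypothetical struct whose first member is an array. */
struct layout_flags {
   unsigned int slots[16];
   int count;
};

/* Returns the sum of all fields of a zero-initialised struct. */
unsigned int sum_zeroed_flags(void)
{
   /* The inner braces initialise the array member explicitly.
    * Writing `= { 0 }` zeroes the same fields but may trigger
    * -Wmissing-braces on some compilers. */
   struct layout_flags flags = { { 0 } };
   unsigned int sum = (unsigned int)flags.count;

   for (int i = 0; i < 16; i++)
      sum += flags.slots[i];

   return sum;
}
```

Both spellings zero every field; the extra braces only make the nesting explicit.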
Marek Olšák
c7878b0167 ac: silence a warning
trivial
2017-02-25 00:16:38 +01:00
Marek Olšák
35915af6c9 radeonsi: fix broken tessellation on Carrizo and Stoney
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99850

Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-02-25 00:03:09 +01:00
Marek Olšák
e027935a79 st/mesa: don't update unrelated states in non-draw calls such as Clear
If a VAO isn't bound and u_vbuf isn't enabled because of the Core profile,
we'll get user vertex buffers in drivers if we update vertex buffers
in glClear. So don't do that.

This fixes a regression since disabling u_vbuf for Core profiles.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-25 00:03:09 +01:00
Marek Olšák
cc2f92b09f st/mesa: set blend state for PBO readbacks
v2: restore the state

Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-25 00:03:09 +01:00
Marek Olšák
a40b76143d st/mesa: reset sample_mask, min_sample, and render_condition for PBO ops
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-25 00:03:09 +01:00
Marek Olšák
1a36bea445 st/mesa: don't check st->vp in update_clip
The clip state is updated before VS, so it can be NULL for the first draw
call. Just remove the unnecessary dependency on st->vp.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-25 00:03:09 +01:00
Marek Olšák
d17b8d08a3 trace: remove pipe_resource wrapping
Not needed. ddebug does the same thing. The limitation is that drivers
can only use pipe_resource::screen through pipe_resource_reference.

This unbreaks trace, because pipe_context uploaders aren't wrapped,
so trace doesn't understand buffers returned by them.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-25 00:03:09 +01:00
Marek Olšák
4a883966c1 gallium: remove PIPE_CAP_USER_INDEX_BUFFERS
all drivers support it

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>  (VMware driver only)
2017-02-25 00:03:09 +01:00
Marek Olšák
4700f409fb st/mesa: assume all drivers support user index buffers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>  (VMware driver only)
2017-02-25 00:03:09 +01:00
Marek Olšák
e78ccee933 svga: implement user index buffers
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>  (VMware driver only)
2017-02-25 00:03:09 +01:00
Marek Olšák
7fff5b77f1 freedreno: add support for user index buffers
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-25 00:03:09 +01:00
Marek Olšák
19c51e072b etnaviv: add support for user index buffers
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-25 00:03:09 +01:00
Marek Olšák
f139b6fb4f gallium/util: add new helpers for user index buffer uploading
v3: split from the etnaviv patch; fix new_ib.buffer leak

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>  (VMware driver only)
2017-02-25 00:03:09 +01:00
Elie TOURNIER
b10197e3a4 nir: delete magic number
Signed-off-by: Elie Tournier <tournier.elie@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-02-24 13:02:24 -08:00
Roland Scheidegger
c3a94d9195 gallium/util: (trivial) fix util_clear_render_target
the format of the rt can be different than the one of the texture, so must
propagate the format explicitly to the helper. Broken since
3f9c5d6244 (but unused by st/mesa).
2017-02-24 20:39:56 +01:00
Emil Velikov
9833488974 util: automake: add sha1/README to the tarball
Suggested-by: Andreas Boll <andreas.boll.dev@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-24 17:38:16 +00:00
Emil Velikov
6854716f37 mapi: remove unused mapi.[ch]
The final user of it was st/vega.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2017-02-24 17:37:02 +00:00
Emil Velikov
93369aa928 blorp: automake: add TODO to the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2017-02-24 17:37:00 +00:00
Emil Velikov
ab6fa871ef anv: automake: add TODO to the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2017-02-24 17:36:59 +00:00
Emil Velikov
aa63b7fa16 vc4: automake: add the kernel/README to the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2017-02-24 17:36:57 +00:00
Emil Velikov
f64a7c74c3 nir: automake: add the README to the tarball
Similar to other accompanying documentation we have in-tree.
For example glsl/README.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
2017-02-24 17:36:45 +00:00
Emil Velikov
e3ad2d40db radv/entrypoints: Only generate entrypoints for supported features
This changes the way radv_entrypoints_gen.py works from generating a
table containing every single entrypoint in the XML to just the ones
that we actually need.  There's no reason for us to burn entrypoint
table space on a bunch of NV extensions we never plan to implement.

RADV implements VK_AMD_draw_indirect_count, so add that to the list.

Port of 114c281e70
"anv/entrypoints: Only generate entrypoints for supported features"

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-02-24 17:36:25 +00:00
Robert Bragg
d1bb7895b9 main/performance_query: s/GLboolean/bool/
Ideally would have caught these when adding the interface but this just
switches a few return types for the INTEL_performance_query backend
interface to bool instead of GLboolean.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-24 17:16:11 +00:00
Eric Engestrom
1534fc6d10 eglapi: replace linear entrypoint search with binary search
Tested with dEQP-EGL.functional.get_proc_address.*

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-24 17:00:50 +00:00
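The switch in 1534fc6d10 from a linear scan to a binary search can be sketched with the standard `bsearch()`. This is an illustration only: the table contents, the stub functions and the names `lookup_entrypoint` and `compare_entrypoint` are hypothetical, not the code in eglapi.c. It also shows why the sortedness check in d25dea0c68 matters, since `bsearch()` gives wrong results on an unsorted table.

```c
#include <stdlib.h>
#include <string.h>

typedef void (*proc_t)(void);

/* Hypothetical entrypoint record. */
struct entrypoint {
   const char *name;
   proc_t function;
};

/* Placeholder functions standing in for real entrypoints. */
void stub_bind_api(void) {}
void stub_create_context(void) {}
void stub_destroy_context(void) {}

/* Must stay sorted by name in strcmp() order for bsearch() to work. */
static const struct entrypoint entrypoints[] = {
   { "eglBindAPI",        stub_bind_api },
   { "eglCreateContext",  stub_create_context },
   { "eglDestroyContext", stub_destroy_context },
};

/* bsearch() passes the key first and the table element second. */
static int compare_entrypoint(const void *key, const void *elem)
{
   const struct entrypoint *e = elem;
   return strcmp((const char *)key, e->name);
}

/* Returns the function registered for `name`, or NULL if absent. */
proc_t lookup_entrypoint(const char *name)
{
   const struct entrypoint *e =
      bsearch(name, entrypoints,
              sizeof(entrypoints) / sizeof(entrypoints[0]),
              sizeof(entrypoints[0]), compare_entrypoint);

   return e ? e->function : NULL;
}
```

With N entries the lookup needs on the order of log2(N) string comparisons instead of up to N.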
Eric Engestrom
d25dea0c68 egl: make sure entrypoints list is always sorted
Starting with the next commit, a badly sorted list will break
eglGetProcAddress().

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-24 17:00:50 +00:00
Eric Engestrom
557f3181bf egl: distribute all tests
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-24 17:00:50 +00:00
Eric Engestrom
f92fd4d7a8 eglapi: move entrypoints list out to its own file
This will allow us to make sure the list is always sorted in the next
commit.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-24 17:00:50 +00:00
Eric Engestrom
2b3cd82e18 eglapi: sort entrypoints list
Let's make that comment true.
It will also be necessary in a couple commits (using bsearch).

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-24 17:00:50 +00:00
Eric Engestrom
3b69c4a8e8 eglapi: use macro to map entrypoints to functions
As of the last 3 commits, there's a function for each entrypoint.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-24 17:00:50 +00:00
Eric Engestrom
66d5ec5f3f eglapi: add entrypoint for eglClientWaitSyncKHR
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-24 17:00:50 +00:00
Eric Engestrom
b7f6f3b3e5 eglapi: add entrypoint for eglDestroySyncKHR
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-24 17:00:50 +00:00
Eric Engestrom
df7fa30aec eglapi: add entrypoint for eglDestroyImageKHR
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-24 17:00:50 +00:00
Thomas Hellstrom
7b82efe4ee st/va: Fix up YV12 to NV12 putImage conversion
Use the utility u_copy_nv12_from_yv12 to implement this similarly to
how it's been done in the VDPAU state tracker. The old code mixed up
planes and fields and didn't correctly handle video surfaces in
interlaced format.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2017-02-24 16:44:34 +01:00
Thomas Hellstrom
3a418322ec st/vdpau: Provide YV12 to NV12 putBits conversion v2
mplayer likes putting YV12 data, and if there is a buffer format mismatch,
the vdpau state tracker would try to reallocate the video surface as an
YV12 surface. A virtual driver doesn't like reallocating and doesn't like YV12
surfaces, so if we can't support YV12, try an YV12 to NV12 conversion
instead.

Also advertise that we actually can do the getBits and putBits conversion.

v2: A previous version of this patch prioritized conversion before
reallocating. This has been changed to prioritize reallocating in this version.

Cc: Christian König <christian.koenig@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2017-02-24 16:44:33 +01:00
Leo Liu
5398d006de configure.ac: check require_basic_egl only if egl enabled
Otherwise the configuration fails when building independent libs
like vdpau, vaapi or omx

Fixes: 1ac40173c2 ("configure.ac: simplify EGL requirements for
drivers dependent on EGL")

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-02-24 09:48:47 -05:00
Eric Engestrom
3cc33e7640 glx: add GLXdispatchIndex sort check
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-24 14:44:58 +00:00
Lars Hamre
caf4252a01 docs: update features.txt for GL_ARB_clear_texture with llvmpipe and softpipe
Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-02-24 15:41:26 +01:00
Lars Hamre
a876b50b20 softpipe: enable clear_texture with util_clear_texture
Passes all corresponding piglit tests.

Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-02-24 15:41:13 +01:00
Lars Hamre
12f2058b47 llvmpipe: enable clear_texture with util_clear_texture
Passes all corresponding piglit tests.

Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-02-24 15:40:57 +01:00
Lars Hamre
3f9c5d6244 gallium: implement util_clear_texture
v3: have util_clear_texture mirror the pipe function (Roland Scheidegger)
v2: rework util clear functions such that they operate on a resource
    instead of a surface (Roland Scheidegger)

Creates a util_clear_texture function for implementing the GL_ARB_clear_texture
in softpipe and llvmpipe.

Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-02-24 15:40:11 +01:00
Jerome Duval
62e27170a7 haiku/winsys: fix dt prototype args
2017-02-24 14:10:57 +00:00
Jerome Duval
40b0c8666c haiku: build fixes around debug defines
2017-02-24 14:10:57 +00:00
Dave Airlie
ccb70d6f53 radv: add sample mask output support
This adds support to write to sample mask from the fragment shader.

We can optimise this later like radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-24 10:31:53 +10:00
Dave Airlie
8282c5c771 radv/ac: refactor our fmask sample index fixup.
This refactors out the sample index fixup between
txf and image load.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-24 10:31:49 +10:00
Dave Airlie
5e9ead0fa2 radv: fetch sample index via fmask for image coord as well.
This follows the txf_ms code. I can't figure out why amdgpu-pro
doesn't do this in their shaders; they must know something we don't.

This fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_id.*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-24 10:31:44 +10:00
Dave Airlie
bdcbe7c76b radv: add sample mask input support
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-24 10:31:35 +10:00
Dave Airlie
58c97a0791 radv: enable location at sample when persample is forced.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-24 10:31:30 +10:00
Dave Airlie
fc430c391b radv: fix interpolation at wrong place for offset interp
The code was interpolating at the offset from the sample,
not the offset from the center. Also, for persample interpolation
modes, force the pixel center to be at the sample.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-24 10:31:19 +10:00
George Kyriazis
dcac48bfee swr: fix index buffers with non-zero indices
Fix issue with index buffers that do not contain a 0 index.  0 index
can be a non-valid index if the (copied) vertex buffers are a subset of the
user's (which happens because we only copy the range between min & max).
Core will use an index passed in from the driver to replace invalid indices.

Only do this for calls that contain non-zero indices, to minimize performance
cost.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-23 16:36:18 -06:00
George Kyriazis
669d8f626f swr: add fetch shader cache
For now, the cache key is all of FETCH_COMPILE_STATE.

Use new/delete for swr_vertex_element_state, since we have to call the
constructors/destructors of the struct elements.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-23 16:36:13 -06:00
Timothy Arceri
987d8037ca st/mesa: free shader cache buffer on fallback
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2017-02-24 09:01:59 +11:00
Timothy Arceri
c24d0aaa9a st/mesa: fix crash in shader cache caused by race condition
If a thread doesn't load GLSL IR from cache but does load TGSI
from cache (that was created by another thread) then it will
crash due to expecting gl_program_parameter_list to have been
restored from the GLSL IR cache and not be null.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2017-02-24 09:01:59 +11:00
Jason Ekstrand
261092f7d4 anv: Enable MSAA compression
This just enables basic MSAA compression (no fast clears) for all
multisampled surfaces.  This improves the framerate of the Sascha
"multisampling" demo by 76% on my Sky Lake laptop.  Running Talos on
medium settings with 8x MSAA, this improves the framerate in the
benchmark by 80%.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-02-23 12:10:42 -08:00
Jason Ekstrand
42b10b175d anv/blorp/clear_subpass: Only set surface clear color for fast clears
Not all clear colors are valid.  In particular, on Broadwell and
earlier, only 0/1 colors are allowed in surface state.  No CTS tests are
affected outright by this because, apparently, the CTS coverage for
different clear colors is pretty terrible.  However, when multisample
compression is enabled, we do hit it with CTS tests and this commit
prevents regressions when enabling MCS on Broadwell and earlier.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-02-23 12:10:42 -08:00
Pohjolainen, Topi
042cc201f2 intel/isl: Apply render target alignment constraints for MCS
v2: Instead of having the same block in isl_gen7,8,9.c add it
    once into isl.c::isl_choose_image_alignment_el() instead.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-02-23 12:10:42 -08:00
Lionel Landwerlin
34e29b2ebd intel/isl: add MCS width constraint 16 samples
v3 (Jason Ekstrand): Add a comment explaining why

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-02-23 12:10:42 -08:00
Jason Ekstrand
3885375195 intel/isl: Return surface creation success from aux helpers
The isl_surf_init call that each of these helpers make can, in theory,
fail.  We should propagate that up to the caller rather than just
silently ignoring it.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-02-23 12:10:42 -08:00
Kenneth Graunke
e6e8475b0f glsl: Raise a link error for non-SSO ES programs with a TES but no TCS.
OpenGL allows the TCS to be missing and supplies an implicit passthrough
shader, but OpenGL ES does not (see section 7.3 of the ES 3.2 spec,
cited above in the code).

One open question is how to handle this for ARB_ES3_2_compatibility.
This patch raises the link error for all ES shading language programs,
but it might make sense to base it on the API.  The approach taken in
this patch is more restrictive, but should still allow any valid ES
programs to work in GL.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Andres Gomez <agomez@igalia.com>
2017-02-23 11:07:06 -08:00
Samuel Iglesias Gonsálvez
a9c488f285 isl/state: fix assert on raw buffer surface state minimum size
From IVB PRM, SURFACE_STATE::Height:

"For typed buffer and structured buffer surfaces, the number of
 entries in the buffer ranges from 1 to 2^27 . For raw buffer
 surfaces, the number of entries in the buffer is the number of bytes
 which can range from 1 to 2^30."

The minimum value is 1, according to the spec. The spec quote
was already added into the code by 028f6d8317.

Fixes crashing tests under:

dEQP-VK.robustness.buffer_access.*

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-02-23 11:46:47 +01:00
Iago Toral Quiroga
42b9057447 glsl: enable early_fragment_tests implicitly with post_depth_coverage
From ARB_post_depth_coverage:

   "This extension allows the fragment shader to control whether values in
    gl_SampleMaskIn[] reflect the coverage after application of the early
    depth and stencil tests.  This feature can be enabled with the following
    layout qualifier in the fragment shader:

       layout(post_depth_coverage) in;

    Use of this feature implicitly enables early fragment tests."

And a bit later it also adds:

   "early_fragment_tests" requests that fragment tests be performed before
    fragment shader execution, as described in section 15.2.4 "Early Fragment
    Tests" of the OpenGL Specification. If neither this nor post_depth_coverage
    are declared, per-fragment tests will be performed after fragment shader
    execution."

Fixes:
GL45-CTS.post_depth_coverage_tests.PostDepthSampleMask

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-23 11:21:44 +01:00
Samuel Iglesias Gonsálvez
6ca4347c82 glsl: refactor get_variable_being_redeclared() to always return an ir_variable pointer
It will return the current variable ('var') or the earlier declaration ('earlier') in
case of redeclaration of that variable.

In order to distinguish between both, 'is_redeclaration' boolean will indicate in which
case we are.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-02-23 06:56:45 +01:00
Samuel Iglesias Gonsálvez
a73a618933 glsl: fix heap-use-after-free in ast_declarator_list::hir()
The get_variable_being_redeclared() function can free 'var' because
a re-declaration of an unsized array variable can establish the size, so
we set the array type to the 'earlier' declaration and free 'var' as it is
not needed anymore.

However, the same 'var' is referenced later in ast_declarator_list::hir().

This patch fixes it by picking the ir_variable_mode from the proper
ir_variable.

This error was detected by Address Sanitizer.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99677
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
2017-02-23 06:56:16 +01:00
Charmaine Lee
043883647a st/wgl: flush with ST_FLUSH_WAIT before releasing shared contexts
Before releasing a shared context, flush the context
with ST_FLUSH_WAIT to make sure all commands are executed.
This ensures that rendering to any shared resources is completed
before they will be referenced by another context.

Fixes an intermittent flickering with Photoshop. (VMware bug# 1779340)

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-18 09:36:42 -08:00
Charmaine Lee
d793b54c4e st: add ST_FLUSH_WAIT to st_context_flush()
When st_context_flush() is called with ST_FLUSH_WAIT,
the function will return after the fence is completed.

Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-18 09:36:42 -08:00
Dave Airlie
b71e6538a8 radv/ac: handle gs->copy shader clip distances.
This fixes up the clip distance passing between the geometry
shader and the copy shader. It packs the clip and cull distances
into one or two consecutive slots, which avoids wasting space and
makes sure the gs output and copy shader input agree on where
things are stored.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-23 15:31:41 +10:00
Dave Airlie
bec584ec0e radv/ac: pass clips properly from vertex->geometry shader stages.
This works out the geometry shader clip/cull inputs separately
to the outputs, and uses that information to read from the ES->GS
ring buffer. It stores the clip/cull distances packed into one
or two slots. It fixes the es output emission and gs input
reading to match.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-23 15:31:37 +10:00
Dave Airlie
c2cfb54f13 radv/ac: rename num clips/cull to output clips/culls
As geom shaders can have different ones on entry and exit.

also move to uint8_t as these are never that big.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-23 15:31:10 +10:00
Dave Airlie
c2ed2685fd vulkan/wsi: move image count to shared structure.
For prime support I need to access this, so move it in advance.

[airlied: fix int->uint32_t]

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-23 15:30:32 +10:00
Timothy Arceri
4711e54336 radeon: fix r600 builds when old version of llvm is present
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-02-23 14:05:55 +11:00
Dylan Baker
fb26e6c0d4 vulkan: Fix gen_enum_to_str in out of tree builds
In some configurations the util directory is created when building out
of tree, but not others. This patch ensures that it's created.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-and-Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-02-22 17:08:52 -08:00
Jason Ekstrand
1bd0e9ca33 anv/Makefile: Gather all the genX files into one place
While we're here, we also fix the alphabetization of the list of
genx_* files.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-22 15:07:18 -08:00
Timothy Arceri
2f3290ac28 r600/radeonsi: enable glsl/tgsi on-disk cache
For gpu generations that use LLVM we create a timestamp string
containing both the LLVM and Mesa build times, otherwise we just
use the Mesa build time.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-23 09:20:22 +11:00
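The timestamp rule in 2f3290ac28 can be sketched as follows. This is an assumption-laden illustration, not the driver code: the function name `format_cache_timestamp`, the `uses_llvm` flag, the field order and the `_` separator are all hypothetical. It only shows combining two build times into one string when LLVM is involved and falling back to the Mesa build time alone otherwise.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes the cache timestamp string into buf and
 * returns the value of snprintf() (the length the full string needs). */
int format_cache_timestamp(char *buf, size_t size,
                           uint32_t mesa_timestamp,
                           uint32_t llvm_timestamp,
                           int uses_llvm)
{
   if (uses_llvm) {
      /* Shader binaries depend on both builds, so both go in the string. */
      return snprintf(buf, size, "%" PRIu32 "_%" PRIu32,
                      mesa_timestamp, llvm_timestamp);
   }

   /* Without LLVM only the Mesa build time matters. */
   return snprintf(buf, size, "%" PRIu32, mesa_timestamp);
}
```

Including both values means that rebuilding either component yields a different string, so stale cache entries are not reused.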
Timothy Arceri
27cecafefd st/mesa: get on-disk shader cache
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-23 09:20:22 +11:00
Timothy Arceri
8239eef2f7 ddebug/rbug/trace: add get_disk_shader_cache() to pass-throughs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-23 09:20:22 +11:00
Timothy Arceri
4be98ed5fd gallium: add get_disk_shader_cache() callback
V2: Provide more detail in callback description and add description to
    screen.rst

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-23 09:20:22 +11:00
Timothy Arceri
9f506d817e st/mesa: implement a tgsi on-disk shader cache
Implements a tgsi cache for the OpenGL state tracker.

V2: add support for compute shaders

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-23 09:20:22 +11:00
Timothy Arceri
b9de1c2e02 st/mesa: add sha1 field to st program structs
This will be used to share the sha1 computed by the tgsi load
function with the tgsi write function.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-23 09:20:22 +11:00
Timothy Arceri
0d5130bdd0 st/mesa: move set_prog_affected_state_flags() to st_program.c
We want to use this in the new tgsi shader cache so we move it here
and make it available externally.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-23 09:20:22 +11:00
Timothy Arceri
d258055c8b util/disk_cache: fix bug with deleting old cache dirs
If there was more than a single directory in the .cache/mesa dir
then it would only remove one (or none) of the directories.

Apparently Valgrind was also reporting:
Conditional jump or move depends on uninitialised value

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-02-23 09:20:22 +11:00
Dylan Baker
8e03250fcf vulkan: Combine wsi and util makefiles
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-02-22 13:12:02 -08:00
Dylan Baker
e9dcb17962 vulkan/util: Add generator for enum_to_str functions
This adds a python generator to produce enum_to_str functions for
Vulkan from the vk.xml API description. It supports extensions as well
as core API features, and the generator works with both python2 and
python3.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-02-22 13:12:02 -08:00
Thomas Hellstrom
bda59f6e41 Revert "st/vdpau: Fix multithreading"
This reverts commit f1e5dfbe3c.

For a detailed discussion see
https://lists.freedesktop.org/archives/mesa-dev/2017-February/145283.html

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2017-02-22 21:50:15 +01:00
Nayan Deshmukh
b8861911c5 vl: u_upload_alloc might fail to allocate buffer in bicubic filter
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-02-22 21:49:19 +01:00
Marek Olšák
7ce8adad43 gallium: reorder fields in pipe_draw_info
sizeof(struct pipe_draw_info) = 104 -> 88

Also, vertices_per_patch is switched to ubyte, because it can't be more
than 32.

Seemed-reasonable-to: Roland Scheidegger
2017-02-22 20:36:40 +01:00
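The size reduction reported in 7ce8adad43 comes from removing alignment padding. The sketch below uses two hypothetical structs, not the real pipe_draw_info, to show how ordering fields from widest to narrowest can shrink a struct. The exact sizes depend on the ABI, so the code only reports them via `sizeof`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with narrow and wide fields interleaved; the
 * compiler typically inserts padding before each wider field. */
struct draw_info_interleaved {
   uint8_t  indexed;
   uint64_t offset;
   uint8_t  primitive_restart;
   uint32_t count;
   uint8_t  vertices_per_patch;
};

/* Same fields, ordered from widest to narrowest. */
struct draw_info_reordered {
   uint64_t offset;
   uint32_t count;
   uint8_t  indexed;
   uint8_t  primitive_restart;
   uint8_t  vertices_per_patch;
};

size_t interleaved_size(void)
{
   return sizeof(struct draw_info_interleaved);
}

size_t reordered_size(void)
{
   return sizeof(struct draw_info_reordered);
}
```

On a typical 64-bit ABI the interleaved version is expected to be noticeably larger, because each `uint8_t` that precedes a wider field forces padding up to that field's alignment.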
Marek Olšák
3b04566bba gallium/hud: handle a thread switch for API-thread-busy monitoring
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-22 20:26:39 +01:00
Marek Olšák
31e7ba7124 gallium/hud: prevent an infinite loop
v2: use UINT64_MAX / 11

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-22 20:26:39 +01:00
Marek Olšák
24847dd1b5 gallium/u_queue: isolate util_queue_fence implementation
it's cleaner this way.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-22 20:26:39 +01:00
Marek Olšák
4aea8fe7e0 gallium/u_queue: fix random crashes when the app calls exit()
This fixes:
    vdpauinfo: ../lib/CodeGen/TargetPassConfig.cpp:579: virtual void
    llvm::TargetPassConfig::addMachinePasses(): Assertion `TPI && IPI &&
    "Pass ID not registered!"' failed.

v2: use list_head, switch the call order in destroy

Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-22 20:26:39 +01:00
Robert Bragg
a96c9564e3 i965: Implement INTEL_performance_query backend
This adds a bare-bones backend for the INTEL_performance_query extension
that exposes pipeline statistics.

Although this could be considered redundant given that the same
statistics are already available via query objects, they are a simple
starting point for this extension and it's expected to be convenient for
tools wanting to have a single go-to API to introspect what performance
counters are available, along with names, descriptions and semantic/data
types.

This code is derived from Kenneth Graunke's work, temporarily removed
while the frontend and backend interface were reworked.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-22 19:16:21 +00:00
Robert Bragg
0e7464f0a9 mesa: Model INTEL perf query backend after query obj BE
Instead of using the same backend interface as AMD_performance_monitor
this defines a dedicated INTEL_performance_query interface that is
modelled more on the ARB_query_buffer_object interface (considering the
similarity of the extensions) with the addition of vfuncs for
initializing and enumerating query and counter info.

Compared to the previous backend, some notable differences are:

- The backend is free to represent counters using whatever data
  structures are optimal/convenient since queries and counters are
  enumerated via an iterator api instead of declaring them using
  structures directly shared with the frontend.

  This is also done to help us support the full range of data and
  semantic types available with INTEL_performance_query which is awkward
  while using a structure shared with the AMD_performance_monitor
  backend since neither extension's types are a subset of the other.

- The backend must support waiting for a query instead of the frontend
  simply using glFinish().

- Objects go through 'Active' and 'Ready' states consistent with the
  query object backend (hopefully making them more familiar). There is
  no 'Ended' state (which used to show that a query has ended at least
  once for a given object). There is a new 'Used' state, set when a
  query is first begun which implies that we are expecting to get
  results back for the object at some point. There's no equivalent to
  the 'EverBound' state since the spec doesn't require there to be a
  limbo state between generating IDs and associating them with an object
  on query Begin.

The INTEL_performance_query and AMD_performance_monitor extensions are
now completely orthogonal within Mesa main (though a driver could
optionally choose to implement both extensions within a unified backend
if that were convenient for the sake of sharing state/code).

v2: (Samuel Pitoiset)
- init PerfQuery.NumQueries in frontend
- s/return_string/output_clipped_string/
- s/backed/backend/ typo
- remove redundant *bytesWritten = 0
v3:
- Add InitPerfQueryInfo for lazy probing of available queries
v4:
- Clean up some internal usage of GL typedefs (Ken)

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-22 14:07:09 +00:00
Robert Bragg
d83a33a9de mesa: Separate INTEL_performance_query frontend
To allow the backend interfaces for AMD_performance_monitor and
INTEL_performance_query to evolve independently based on the more
specific requirements of each extension this starts by separating
the frontends of these extensions.

Even though there wasn't much tying these frontends together, this
separation intentionally copies the few helpers/utilities that were
shared between the two extensions, avoiding any re-factoring specific to
INTEL_performance_query so that the evolution will be easier to follow
later.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-22 12:12:27 +00:00
Thomas Hellstrom
ccc8720cf7 gallium/vl: Simplify the matrix filter fragment shader
It looks like it was partly copied from the median filter fragment shader
and unnecessarily saved a lot of temporary values.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-02-22 10:22:17 +01:00
Thomas Hellstrom
f1e5dfbe3c st/vdpau: Fix multithreading
The vdpau state tracker allows multiple threads access to the same gallium
context simultaneously. We can fix this either by locking the same mutex
each time the context is used or by using a different gallium context for
each mutex domain. Here we do the latter, although I'm not sure that's really
the best option.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-02-22 10:20:37 +01:00
Thomas Hellstrom
bcc9fd378d gallium/vl: Parameter substitution in the csc matrix computation
Makes the code significantly more readable.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-02-22 10:20:07 +01:00
Thomas Hellstrom
4c3fe3257d gallium/vl: Simplify usage of full range matrices
When looking at the full range matrices, it becomes obvious that the difference
between the standard matrices and the full range matrices is that the full
range matrices are multiplied by 1.164. Together with offsetting the y value
with -16/255, this will scale and offset RGB with the desired quantities.
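
A worked example of the scale and offset, assuming the 1.164 factor is the usual 255/219 luma expansion from studio range (16..235) to full range (0..255). The function name is hypothetical and the values are normalized to [0, 1]; this is a sketch of the arithmetic, not the vl_csc code.

```python
# Luma offset and expansion factor for 8-bit studio-range video,
# expressed on normalized [0, 1] values.
Y_OFFSET = 16.0 / 255.0
RANGE_SCALE = 255.0 / 219.0  # approximately 1.164


def expand_luma(y):
    """Map studio-range luma (16/255 .. 235/255) onto 0.0 .. 1.0."""
    return (y - Y_OFFSET) * RANGE_SCALE


if __name__ == "__main__":
    for code in (16, 128, 235):
        print(code, expand_luma(code / 255.0))
```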

However, the standard SMPTE 240M matrix seems to differ a bit since the
U and V coefficients are only multiplied with 1.138 to get the full range
matrix. This would actually alter the color somewhat so I figure that's an
error. The full range matrix is consistent with Nvidia's VDPAU implementation.

We can also incorporate the ybias in the brightness simplifying the
calculation somewhat.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-02-22 10:19:27 +01:00
Thomas Hellstrom
f01e947cdb gallium/vl Fix brightness matrix description
The brightness matrix doesn't actually match the procamp matrix and
what's calculated in vl_csc_get_matrix.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-02-22 10:18:30 +01:00
Thomas Hellstrom
ec8139e50c gallium/vl: Don't map vertex buffers on creation
It will cause multiple simultaneous maps of the same vertex buffer and
flushed-while-mapped warnings.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-02-22 10:17:51 +01:00
Thomas Hellstrom
f2872bf8c3 gallium/vl: Add sampler views to video filter fragment shaders
Needed for at least the svga driver.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-02-22 10:17:07 +01:00
Thomas Hellstrom
53b4584555 gallium/vl: declare sampler views in compositor shaders
The svga driver relies on the existence of these sampler views.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-02-22 10:15:16 +01:00
Brian Paul
b87ef9e606 util: fix MSVC build issue in disk_cache.h
Windows doesn't have dlfcn.h.  Protect the code in question
with #if ENABLE_SHADER_CACHE test.  And fix indentation.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-02-21 20:54:46 -07:00
Dave Airlie
40e0dbf96c radv: fix typo in the subpass barrier patch.
Fixes: 6dbb0eaccc radv: handle subpass cache flushes

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-22 02:22:30 +00:00
Rafael Antognolli
d71e1f32c6 i965/gen6+: Enable arb_transform_feedback_overflow_query.
This extension adds new query types which can be used to detect overflow
of transform feedback buffers. The new query types are also accepted by
conditional rendering commands.

v3:
    - s/gen7+/gen6+/ in the relnotes (Jordan Justen)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-21 16:28:32 -08:00
Rafael Antognolli
924a1b90aa i965: Add support for xfb overflow query on conditional render.
Enable the use of a transform feedback overflow query with
glBeginConditionalRender. The render commands will only execute if the
query is true (i.e. if there was an overflow).

Use ARB_conditional_render_inverted to change this behavior.

v4:
    - reuse MI_MATH calcs from hsw_queryob (Kenneth)
    - fallback to software conditional rendering when MI_MATH is not
      available (Kenneth)

v5:
    - check query->Target (Kenneth)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-21 16:28:32 -08:00
Rafael Antognolli
d03ec496ee i965: Add support for xfb overflow on query buffer objects.
Enable getting the results of a transform feedback overflow query with a
buffer object.

v4:
    - make hsw_overflow_result_to_gpr0 a public function, so it can be used
      by conditional render. (Kenneth)
    - fix typo grp0/gpr0 (Kenneth)
    - rename load_gen_written_data_to_regs to
      load_overflow_data_to_cs_gprs (Kenneth)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-21 16:28:32 -08:00
Rafael Antognolli
5933ec86fd i965: add plumbing for ARB_transform_feedback_overflow_query.
When querying for transform feedback overflow on one or all of the
streams, store information about number of generated and written
primitives. Then check whether generated == written.
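
The comparison described above can be sketched as follows. The function name and the list-of-pairs representation are assumptions for illustration; the driver does this with GPU counters rather than Python values.

```python
def xfb_overflow(counters):
    """Report overflow for a set of streams.

    counters is a list of (generated, written) primitive counts, one
    pair per stream. A stream overflowed if fewer primitives were
    written than generated, i.e. the two counts differ.
    """
    return any(generated != written for generated, written in counters)


if __name__ == "__main__":
    print(xfb_overflow([(10, 10), (4, 4)]))  # every stream fit
    print(xfb_overflow([(10, 10), (5, 4)]))  # second stream overflowed
```

Passing a single pair covers the per-stream query, while passing all streams covers the "any stream" variant.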

v2:
    - use only SO_PRIM_STORAGE_NEEDED, do not fallback to
      CL_INVOCATION_COUNT. (Kenneth)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-21 16:28:32 -08:00
Rafael Antognolli
a80ebff1b9 mesa: Track transform feedback overflow query objects.
Also update checks on conditional rendering.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-21 16:28:31 -08:00
Rafael Antognolli
273bab26af mesa: Add types for ARB_transform_feedback_overflow_query.
Add some basic types and storage for the queries of this extension.

v2:
    - update date of extension (Kenneth)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-21 16:28:31 -08:00
Eric Engestrom
89af6bf2cb gallium/docs: use imgmath instead of pngmath
WARNING: sphinx.ext.pngmath has been deprecated. Please use
	sphinx.ext.imgmath instead.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-22 00:01:08 +00:00
Eric Engestrom
d88a0dffe3 gallium/docs: fix section title formatting
src/gallium/docs/source/tgsi.rst:3488: WARNING: Title underline too short.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-22 00:01:01 +00:00
Eric Engestrom
5aa7fa2bbf gallium/docs: add missing newlines
Without these, mathjax considers these as the continuation of the
previous line.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-22 00:00:57 +00:00
Eric Engestrom
3ae77c912e gallium/docs: add missing math formatting
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-22 00:00:51 +00:00
Eric Engestrom
3a0d2c54cf gallium/docs: fix sublist formatting
src/gallium/docs/source/context.rst:95: ERROR: Unexpected indentation.

Sub lists need to be surrounded by a blank line.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-22 00:00:38 +00:00
Timothy Arceri
0441e6bc8b util/disk_cache: create timestamp and gpu_id dirs when MESA_GLSL_CACHE_DIR is used
The make check test is also updated to make sure these dirs are created.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-22 08:40:14 +11:00
Timothy Arceri
207e3a6e4b util/radv: move *_get_function_timestamp() to utils
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-22 08:40:00 +11:00
Kenneth Graunke
ed6b47f435 docs: Update features.txt and relnotes for GL_ARB_transform_feedback2
2017-02-21 12:38:13 -08:00
Kenneth Graunke
0a7b252c5b i965: Enable ARB_transform_feedback2 on Sandybridge.
The only feature over and above ES 3.0 is DrawTransformFeedback().

We already have to do the whole SOL_NUM_PRIMS_WRITTEN counter dance in
order to compute the SVBI value for ResumeTransformFeedback(), at which
point our existing GetTransformFeedbackVertexCount() implementation will
do the trick (though with a stall to CPU map the buffer).

Someday, we could probably implement DrawTransformFeedback() more
efficiently, using the "Load Internal Vertex Count" feature of
3DSTATE_SVB_INDEX and the 3DPRIMITIVE indirect vertex count bit.

Rumor has it this allows people to use WebGL 2.0 on Sandybridge.

Note that we don't need pipelined register writes like Gen7+ because
we use the 3DSTATE_SVB_INDEX command rather than MI_LOAD_REGISTER_MEM.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99842
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-02-21 12:38:13 -08:00
Kenneth Graunke
0235757422 i965: Properly reset SVBI counters on ResumeTransformFeedback().
This fixes Piglit's ARB_transform_feedback2/change-objects-while-paused
GLES 3.0 test.  When resuming the transform feedback object, we need to
reset the SVBI counters so we continue writing at the correct point in
the buffer.

Instead of SO_WRITE_OFFSET counters (with a DWord offset), we have the
Streamed Vertex Buffer Index (SVBI) counters, which contain a count of
vertices emitted.

Unfortunately, there's no straightforward way to store the current SVBI
counter values to a buffer.  They're not available in a register.  You
can use a bit in the 3DSTATE_SVB_INDEX packet to copy them to another
internal counter which 3DPRIMITIVE can use...but there's no good way to
extract that either.

So, once again, we use SO_NUM_PRIMS_WRITTEN to calculate the vertex
numbers.  Thankfully, we can reuse most of the existing Gen7+ code.
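
A rough sketch of the arithmetic, assuming transform feedback output is recorded as independent points, lines, or triangles so that each written primitive contributes a fixed number of vertices. The names below are hypothetical and do not correspond to driver functions.

```python
# Vertices recorded per primitive for each transform feedback mode.
VERTS_PER_PRIM = {"points": 1, "lines": 2, "triangles": 3}


def vertices_written(prims_written, mode):
    """Derive a vertex index from a primitives-written count."""
    return prims_written * VERTS_PER_PRIM[mode]


if __name__ == "__main__":
    # Resuming after 5 triangles means writing continues at this vertex.
    print(vertices_written(5, "triangles"))
```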

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-02-21 12:38:13 -08:00
Kenneth Graunke
eb0331382a i965: Save max_index in brw_transform_feedback_object.
I'm going to need this in a new Resume hook shortly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-02-21 12:38:13 -08:00
Kenneth Graunke
8513090cd7 i965: Update brw_save_primitives_written_counters for pre-Gen7.
Sandybridge and earlier only have a single counter.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-02-21 12:38:13 -08:00
Kenneth Graunke
42a4f91820 i965: Use ctx->Const.MaxVertexStreams rather than BRW_XFB_MAX_STREAMS.
This way on Sandybridge we'll only do 1 stream worth of math, since
we only have one SO_NUM_PRIMS_WRITTEN counter.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-02-21 12:38:13 -08:00
Kenneth Graunke
2af5f0caad i965: Move some code from gen7_sol_state.c to gen6_sol.c.
I plan to use these functions on Sandybridge soon.  I changed the prefix
on a couple of functions to "brw" instead of "gen7" as in theory they
should be usable all the way back to G45.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-02-21 12:38:13 -08:00
Kenneth Graunke
bf8dd21191 i965: Drop dead Gen8+ code from Gen7/sometimes-HSW driver hooks.
These driver hooks are not used when MI_MATH and MI_LOAD_REGISTER_REG
are supported, which Gen8+ can always do.  So this code is dead.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-02-21 12:38:13 -08:00
Marek Olšák
96cbc1ca29 vbo: kill primitive restart lowering in glDrawArrays
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-21 21:28:02 +01:00
Marek Olšák
63c462226e radeonsi: fix issues with monolithic shaders
R600_DEBUG=mono has had no effect since:

    commit 1fabb29717
    Author: Marek Olšák <marek.olsak@amd.com>
    Date:   Tue Feb 14 22:08:32 2017 +0100

    radeonsi: have separate LS and ES main shader parts in the shader selector

Also, this assertion was failing:
    si_state_shaders.c:1307: si_shader_select_with_key: Assertion
    `!shader->is_optimized' failed.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-21 21:27:23 +01:00
Marek Olšák
52581606c2 radeonsi: set no-signed-zeros-fp-math
Recommended by Matt Arsenault.

46757 shaders in 28742 tests
Totals:
SGPRS: 2068851 -> 2066907 (-0.09 %)
VGPRS: 1604056 -> 1602676 (-0.09 %)
Spilled SGPRs: 1402 -> 1382 (-1.43 %)
Spilled VGPRs: 113 -> 113 (0.00 %)
Private memory VGPRs: 1332 -> 1332 (0.00 %)
Scratch size: 3224 -> 3188 (-1.12 %) dwords per thread
Code Size: 58815520 -> 58716788 (-0.17 %) bytes
LDS: 1162 -> 1162 (0.00 %) blocks
Max Waves: 354616 -> 354905 (0.08 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 786452 -> 784508 (-0.25 %)
VGPRS: 530000 -> 528620 (-0.26 %)
Spilled SGPRs: 958 -> 938 (-2.09 %)
Spilled VGPRs: 85 -> 85 (0.00 %)
Private memory VGPRs: 636 -> 636 (0.00 %)
Scratch size: 1880 -> 1844 (-1.91 %) dwords per thread
Code Size: 26349936 -> 26251204 (-0.37 %) bytes
LDS: 304 -> 304 (0.00 %) blocks
Max Waves: 108962 -> 109251 (0.27 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-21 21:27:23 +01:00
Marek Olšák
fd3e73f54e gallivm: add no-signed-zeros-fp-math option to lp_create_builder (v2)
v2: define lp_float_mode

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-21 21:27:23 +01:00
Marek Olšák
84e72f2962 radeonsi: skip TESSINNER/OUTER offchip stores if TES doesn't read them
We were unconditionally storing these outputs, sometimes even one component
at a time, but apps never read them in TES.

Move the TESSINNER/OUTER buffer stores into the TCS epilog where we can
easily disable them on demand.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-21 21:27:23 +01:00
Marek Olšák
d633e23192 radeonsi: skip LDS stores in TCS if there are no LDS output reads
This removes a lot of useless LDS stores.

A few games read TESSINNER/OUTER, but not any other outputs. Most games
don't read any outputs.

The only app doing LDS output reads is UE4 Lightsroom Interior.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-21 21:27:23 +01:00
Marek Olšák
58af0a5385 tgsi/scan: add basic info about tessellation OUT and IN uses
not all of them will be used immediately

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-21 21:27:23 +01:00
Jason Ekstrand
f31ed6d0cd anv: Take a device parameter in anv_state_flush
This allows the helper to check for llc instead of having to do it
manually at all the call sites.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-21 12:26:35 -08:00
Jason Ekstrand
f408971deb anv: Pull all clflushing into a clflush_range helper
All this cache line address calculation stuff is tricky.  Let's not
duplicate it more places than we have to.
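
To show why this calculation is easy to get wrong, here is a minimal sketch of the address arithmetic only. It assumes a 64-byte cache line and uses a hypothetical function name; the actual helper issues flush instructions rather than returning a list.

```python
CACHELINE_SIZE = 64  # assumed line size; must be a power of two


def cachelines_in_range(start, size, line=CACHELINE_SIZE):
    """Return the base address of every cache line touching the range.

    The range is [start, start + size). The first line is found by
    rounding start down to a line boundary, so a range that begins or
    ends mid-line still covers the partial lines at both edges.
    """
    if size <= 0:
        return []
    first = start & ~(line - 1)
    end = start + size
    return list(range(first, end, line))


if __name__ == "__main__":
    print(cachelines_in_range(70, 100))
```

Note that a two-byte range straddling a boundary (for example, bytes 63 and 64) touches two lines, which is the kind of edge case a single shared helper gets right once.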

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-21 12:26:35 -08:00
Jason Ekstrand
16b187c8bb anv: Remove the unused state_pool_emit macro
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-21 12:26:35 -08:00
Jason Ekstrand
f9d7d27d6d anv: Rename clflush_range and state_clflush
It's a bit shorter and easier to work with.  Also, we're about to add a
helper called clflush which does the clflush but without any memory
fencing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-21 12:26:35 -08:00
Jason Ekstrand
075ed20614 intel/blorp: Explicitly flush all allocated state
Found by inspection.  However, I expect it fixes real bugs when using
blorp from Vulkan on little-core platforms.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-02-21 12:26:35 -08:00
Jason Ekstrand
b6b03329af anv: Put everything about queries in genX_query.c
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-21 12:26:35 -08:00
Jason Ekstrand
965fad0e8b anv/Makefile: alphabetize
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-21 12:26:35 -08:00
Jason Ekstrand
40087bcb51 anv/query: Perform CmdResetQueryPool on the GPU
This fixes some rendering corruption in The Talos Principle

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-02-21 12:26:35 -08:00
Jason Ekstrand
dc9abd0e6b genxml: Make MI_STORE_DATA_IMM more consistent
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-02-21 12:26:35 -08:00
Jason Ekstrand
3788cd3239 anv/query: clflush the bo map on non-LLC platforms
Found by inspection

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-02-21 12:26:35 -08:00
Jason Ekstrand
8582ab2d6e anv: Add an invalidate_range helper
This is similar to clflush_range except that it puts the mfence on the
other side to ensure caches are flushed prior to reading.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-02-21 12:26:35 -08:00
Christian Gmeiner
e8d600710c etnaviv: remove number of pixel pipes validation
This validation was added before the etnaviv drm driver landed in
the Linux kernel. Due to some pre-merge API changes we had to fix up
this value, but with a mainline kernel this is not a problem anymore.

Let's remove that validation, which also gets rid of a problem caught
by Coverity, reported to me by imirkin.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-21 21:14:35 +01:00
Christian Gmeiner
a0b16a0890 etnaviv: move pctx initialisation to avoid a null dereference
In case ctx->stream == NULL the fail label gets executed where
pctx gets dereferenced - too bad pctx is NULL in that case.

Caught by Coverity, reported to me by imirkin.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-21 21:14:27 +01:00
Christian Gmeiner
f709096d0e etnaviv: add missing fallthrough annotation
Caught by Coverity, reported to me by imirkin.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-21 21:14:01 +01:00
Emil Velikov
383e8e2d5d docs/releasing.html: reword "distro breaking changes" hunk
v2: s/rare/rarely/ (Eric)

Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-21 18:39:40 +00:00
Emil Velikov
8b79f0ed08 radv: make radv_resolve_entrypoint static
Used only within the generated source file.

Fixes: 12301c5418 ("radv: drop the RADV_CALL macro.")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-02-21 18:31:16 +00:00
Emil Velikov
320561bd83 radv: remove unused radv_dispatch_table dtable
Fixes: 12301c5418 ("radv: drop the RADV_CALL macro.")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-02-21 18:31:14 +00:00
Emil Velikov
9807e9dea6 anv: remove unused anv_dispatch_table dtable
Fixes: 4c9dec80ed ("anv: Get rid of the ANV_CALL macro")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-02-21 18:31:04 +00:00
Emil Velikov
aa5baf1d50 i915: remove extern "C" guards
None of this code is used in C++ context.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-21 18:29:43 +00:00
Emil Velikov
0e74f390d9 i915: remove 'virtual' and extern C workarounds
Analogous to previous commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-21 18:29:41 +00:00
Emil Velikov
3ea07d2be9 i965: remove 'virtual' and extern C workarounds
The headers are properly annotated thus we don't need these.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-21 18:29:38 +00:00
Emil Velikov
8481914681 i965: add extern C notation in headers
Otherwise symbols won't be annotated with C linkage and we'll fail at
link time.

Currently this is worked around by wrapping the header inclusion itself.
The latter is in itself fragile and not recommended.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-21 18:29:28 +00:00
Emil Velikov
dafc325f42 gallium: do not #include foo.h within extern C {}
Analogous to previous commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-21 18:29:25 +00:00
Emil Velikov
e4f971c85f nir: do not #include util/debug.h within extern C {}
It's a problem waiting to happen. Individual headers should be annotated
if needed.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-21 18:29:17 +00:00
Emil Velikov
7fcbb1a902 glsl: resolve extern C workarounds/hacks
Do not wrap header inclusion in extern C since it can cause issues.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-21 18:29:10 +00:00
Emil Velikov
a177a13033 st/mesa: move extern C wrappers where applicable
Namely, after the include directives. The headers are properly annotated
so keeping things as-is is only asking for trouble.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-21 18:29:07 +00:00
Emil Velikov
94b88c1c75 mesa/tests: remove unneeded extern C { #include foo } hack
The header itself (enums.h) is already properly annotated.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-21 18:29:01 +00:00
Emil Velikov
d5db27706c mesa: remove unneeded extern C {} wrapper
compiler.h defines a few mesa specific macros which are not C specific.
This allows us to avoid buggy extern C { #include $system_header }
constructs.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-21 18:28:59 +00:00
Emil Velikov
1451bcb125 mesa: annotate functions for C linkage
i.e. add extern C {} in program/symbol_table.h

It will allow us to remove a workaround we have elsewhere in the code.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-21 18:28:55 +00:00
Emil Velikov
e776e0385c anv: remove unneeded extern C notation
Analogous to previous commit - never used in any C++ code.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-21 18:28:18 +00:00
Emil Velikov
944620bc0e radv: remove unneeded extern C notation
Header is never #include(d) by a C++ source.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-21 18:28:15 +00:00
Rhys Kidd
4bf9862747 glsl/tests: Add UINT64 and INT64 types
glsl/tests/uniform_initializer_utils.cpp:83:14: warning: enumeration value ‘GLSL_TYPE_UINT64’ not handled in switch [-Wswitch]
       switch (type->base_type) {
              ^
glsl/tests/uniform_initializer_utils.cpp:83:14: warning: enumeration value ‘GLSL_TYPE_INT64’ not handled in switch [-Wswitch]

Fixes: 8ce53d4a2f ("glsl: Add basic ARB_gpu_shader_int64 types")
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2017-02-21 18:03:14 +00:00
Eric Engestrom
6181ab9d77 docs: fix gamma correction link
That link has been dead for 15 years...
We could link to Archive.org [1] to get the last time this page existed,
but I feel like Wikipedia is a better choice.

[1] http://web.archive.org/web/20021211151318/http://www.inforamp.net/~poynton/notes/colour_and_gamma/GammaFAQ.html

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-21 14:54:10 +00:00
Eric Engestrom
b347bbb63b docs: add link to gallium doc
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-21 14:43:29 +00:00
Nicolai Hähnle
066a117be7 radeonsi: fix UINT/SINT clamping for 10-bit formats on <= CIK
The same PS epilog workaround as for 8-bit integer formats is required,
since the CB doesn't do clamping.

Fixes GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels*.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-21 10:45:13 +01:00
Nicolai Hähnle
6a1d9684f4 radeonsi: handle MultiDrawIndirect in si_get_draw_start_count
Also handle the GL_ARB_indirect_parameters case where the count itself
is in a buffer.

Use transfers rather than mapping the buffers directly. This anticipates
the possibility that the buffers are sparse (once ARB_sparse_buffer is
implemented), in which case they cannot be mapped directly.

Fixes GL45-CTS.gtf43.GL3Tests.multi_draw_indirect.multi_draw_indirect_type
on <= CIK.

v2:
- unmap the indirect buffer correctly
- handle the corner case where we have indirect draws, but all of them
  have count 0.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-02-21 10:45:02 +01:00
Nicolai Hähnle
550125e1e7 winsys/amdgpu: reduce max_alloc_size based on GTT limits
Allocating huge buffers in VRAM is not a problem, but when those buffers
start being migrated, the kernel runs into errors because it cannot split
those buffers up for moving through GTT.

This should fix intermittent failures of
GL45-CTS.texture_buffer.texture_buffer_max_size

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-21 10:43:38 +01:00
Bas Nieuwenhuizen
8cff852ae2 radv: Don't flush at the start of a command buffer.
The preamble flushes now and the rest is the responsibility of the app.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-21 09:20:03 +01:00
Bas Nieuwenhuizen
5241fb0ffb radv: Flush in the initial preamble CS.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-21 09:19:58 +01:00
Bas Nieuwenhuizen
c121739c47 radv: Special case the initial preamble.
For flushing we don't want to flush every third IB.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-21 09:19:53 +01:00
Bas Nieuwenhuizen
eac790811b radv: Split emitting the cache flush out.
So that we can use it without a cmd_buffer.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-21 09:19:45 +01:00
Bas Nieuwenhuizen
b6e0df2edd radv: Free empty_cs on device destruction.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-21 09:18:50 +01:00
Ben Skeggs
8f4483b609 nvc0: use PascalB for most Pascal boards
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-02-21 10:01:16 +10:00
Dave Airlie
6dbb0eaccc radv: handle subpass cache flushes
This splits out the cache flush bit setting code
dependent on the src/dest access flags.

It then calls it from the subpass barrier code.

It also marks a TODO to remove the aggressive CS/PS
flushes at some point.

This fixes a bunch of the
dEQP-VK.renderpass.attachment_allocation.input_output.*
tests.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-21 09:48:37 +10:00
Grazvydas Ignotas
66d1cb587a r300g: only allow byteswapped formats on big endian
They cause regressions on little endian.

Fixes: 172bfdaa9e ("r300g: add support for PIPE_FORMAT_x8R8G8B8_*")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98869
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-02-21 00:37:02 +01:00
Timothy Arceri
87687afb94 mesa: remove unused variable warning in release builds
This assert might have made sense before but we no longer use
gl_linked_shader here. Unless the caller has really done something
crazy this assert is fairly useless.

We also do some small tidy ups in this change.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-21 08:46:04 +11:00
Emil Velikov
a40ebe73a1 docs/submittingpatches.html: document the Fixes tag
Provide information and an example.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-20 18:21:22 +00:00
Emil Velikov
9e4248b206 docs/submittingpatches.html: remove version tag for nominations
The version tag used to nominate has bitten even experienced mesa
developers. Not to mention that it deviates from the one used in the
kernel leading to further confusion.

Simplify things and omit it altogether.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-20 18:21:22 +00:00
Emil Velikov
f9cdfa33c2 docs/submittingpatches.html: add #backports section
Provide information about merge conflicts resolution and sending
backports.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-20 18:21:22 +00:00
Emil Velikov
d7e0ff0e2b docs/submittingpatches.html: rework the #criteria section
Reword the section to focus on what is allowed, using briefer yet
descriptive wording.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-20 18:21:22 +00:00
Emil Velikov
af9a4d9005 travis: bring the scons build on par with AppVeyor
Namely, always build with LLVM and run the check target.

Cc: Rhys Kidd <rhyskidd@gmail.com>
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-20 18:21:22 +00:00
Ben Crocker
3f1b6ef2aa gallivm: Reenable PPC VSX (v3)
Reenable the PPC64LE Vector-Scalar Extension for LLVM versions >= 3.8.1,
now that LLVM bug 26775 and its corollary, 25503, are fixed.

Amendment: remove extraneous spaces in macro def & invocations.

We would prefer a runtime check, e.g. via an LLVMQueryString
(analogous to glGetString, eglQueryString) or LLVMGetVersion API,
but no such API exists at this time.

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
[Emil Velikov: remove LLVM_VERSION macro]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-20 18:21:22 +00:00
Ben Crocker
b934aae364 gallivm: Override getHostCPUName() "generic" w/ "pwr8" (v4)
If llvm::sys::getHostCPUName() returns "generic", override
it with "pwr8" (on PPC64LE).

This is a work-around for a bug in LLVM: a table entry for "POWER8NVL"
is missing, resulting in (big-endian) "generic" being returned on
little-endian Power8NVL systems.  The result is that code that
attempts to load the least significant 32 bits of a 64-bit quantity in
memory loads the wrong half.

This omission should be fixed in the next version of LLVM (4.0),
but this work-around should be left in place in case some
future version of POWER<n> also ends up unrepresented in LLVM's table.

This workaround fixes failures in the Piglit arb_gpu_shader_fp64 conversion
tests on POWER8NVL processors.

(V4: add similar comment in the code.)

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Cc: 12.0 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-20 18:21:22 +00:00
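The override described in the commit above amounts to a small string check. The sketch below is a C stand-in, not the gallivm code (which is C++ and calls llvm::sys::getHostCPUName()); `pick_cpu_name` and its `is_ppc64le` parameter are hypothetical names used only for illustration.

```c
#include <string.h>

/* Hypothetical stand-in for the work-around: `detected` is whatever the
 * host-CPU query returned; `is_ppc64le` is non-zero on PPC64LE hosts. */
const char *pick_cpu_name(const char *detected, int is_ppc64le)
{
    /* "generic" on PPC64LE means the CPU table had no entry for this
     * processor, so fall back to a known little-endian capable name. */
    if (is_ppc64le && strcmp(detected, "generic") == 0)
        return "pwr8";
    return detected;
}
```

Any name other than "generic" is passed through unchanged, so future processors that LLVM does recognise are unaffected.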
Ben Crocker
a8e9c630f3 gallivm: Improve debug output (V2)
Improve debug output from gallivm_compile_module and
lp_build_create_jit_compiler_for_module, printing the
-mcpu and -mattr options passed to LLC.

V2: enclose MAttrs debug_printf block and llc -mcpu debug_printf
in "if (gallivm_debug & <flags>)..."

Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Cc: 12.0 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v2)
[Emil Velikov: rebase]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-20 18:21:22 +00:00
Marek Olšák
e8c2a05662 gallium/u_suballoc: update comments
as requested by Brian. Trivial.
2017-02-20 18:04:27 +01:00
Jonathan Gray
a042465c21 util/build-id: define ElfW and NT_GNU_BUILD_ID if needed
Define ElfW() and NT_GNU_BUILD_ID if needed as these defines are not
present on at least OpenBSD and FreeBSD.  Fixes the build on OpenBSD.

Fixes: d4fa083e11 ("util: Add utility build-id code.")
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-20 16:39:24 +00:00
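A fallback of the kind the commit above describes might look like the sketch below. The value 3 for NT_GNU_BUILD_ID is the standard ELF note type; selecting the ELF word size from UINTPTR_MAX is an assumption of this sketch and may differ from what the actual patch does. `is_build_id_note` is a hypothetical helper added only to show the constant in use.

```c
#include <stdint.h>

/* Only define the symbols when the platform headers did not. */
#ifndef NT_GNU_BUILD_ID
#define NT_GNU_BUILD_ID 3   /* ELF note type carrying the GNU build ID */
#endif

#ifndef ElfW
#if UINTPTR_MAX > 0xffffffffu
#define ElfW(type) Elf64_##type   /* 64-bit targets */
#else
#define ElfW(type) Elf32_##type   /* 32-bit targets */
#endif
#endif

/* Hypothetical helper: returns non-zero if a note header's type field
 * identifies a GNU build-id note. */
int is_build_id_note(uint32_t note_type)
{
    return note_type == NT_GNU_BUILD_ID;
}
```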
Mauro Rossi
41b5620492 android: define HAVE_DL_ITERATE_PHDR for build-id code
Required due to d4fa083 "util: Add utility build-id code."
to avoid the following build error and warnings:

external/mesa/src/intel/vulkan/anv_device.c:60:32: error: incompatible integer to pointer conversion initializing 'const struct build_id_note *' with an expression of type 'int' [-Werror,-Wint-conversion]
   const struct build_id_note *note = build_id_find_nhdr("libvulkan_intel.so");
                               ^      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/mesa/src/intel/vulkan/anv_device.c:64:19: warning: implicit declaration of function 'build_id_length' is invalid in C99 [-Wimplicit-function-declaration]
   unsigned len = build_id_length(note);
                  ^
external/mesa/src/intel/vulkan/anv_device.c:68:4: warning: implicit declaration of function 'build_id_read' is invalid in C99 [-Wimplicit-function-declaration]
   build_id_read(note, uuid, VK_UUID_SIZE);
   ^
3 warnings and 1 error generated.
[ 40% 1438/3588] target  C: libmesa_vulkan_common_32 <= external/mesa/src/intel/vulkan/anv_image.c
ninja: build stopped: subcommand failed.
build/core/ninja.mk:148: recipe for target 'ninja_wrapper' failed
make: *** [ninja_wrapper] Error 1

Fixes: d4fa083e11 ("util: Add utility build-id code.")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-20 16:33:03 +00:00
Mauro Rossi
9e3d66c1e5 android: glsl: build shader cache sources
Fixes the following build errors:

external/mesa/src/compiler/glsl/linker.cpp:4642: error: undefined reference
 to 'shader_cache_read_program_metadata(gl_context*, gl_shader_program*)'
external/mesa/src/mesa/program/ir_to_mesa.cpp:3135: error: undefined reference
 to 'shader_cache_write_program_metadata(gl_context*, gl_shader_program*)'
clang++: error: linker command failed with exit code 1
...
external/mesa/src/mesa/program/ir_to_mesa.cpp:3135: error: undefined reference
 to 'shader_cache_write_program_metadata(gl_context*, gl_shader_program*)'
external/mesa/src/compiler/glsl/linker.cpp:4642: error: undefined reference
 to 'shader_cache_read_program_metadata(gl_context*, gl_shader_program*)'
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.
build/core/ninja.mk:148: recipe for target 'ninja_wrapper' failed
make: *** [ninja_wrapper] Error 1

Fixes: 9f8dc3bf03 ("utils: build sha1/disk cache only with
Android/Autoconf")
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-20 16:30:37 +00:00
Mauro Rossi
933988901a android: radeonsi: fix sid_tables.h generated header include path
generated-sources-dir-for macro replaces intermediates-dir-for
and LOCAL_MODULE_CLASS is defined as required by new macro,
in order to avoid the following building error:

external/mesa/src/gallium/drivers/radeonsi/si_debug.c:29:10: fatal error: 'sid_tables.h' file not found
         ^
1 error generated.

Fixes: 730574c58e ("android: ac/debug: move sid_tables.h generation and
IB decode to amd/common")
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-20 16:23:13 +00:00
Emil Velikov
920b4d537f docs: add news item and link release notes for 13.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-20 11:56:39 +00:00
Emil Velikov
85acb42522 docs: add sha256 checksums for 13.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 112e75f51b)
2017-02-20 11:55:10 +00:00
Emil Velikov
2b06e91ded docs: add release notes for 13.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 71f3ff57fa)
2017-02-20 11:55:09 +00:00
Dave Airlie
0a44a680ff vulkan/wsi/x11: add support to detect if we can support rendering (v3)
This adds support to radv_GetPhysicalDeviceXlibPresentationSupportKHR
and radv_GetPhysicalDeviceXcbPresentationSupportKHR to check if the
local device file descriptor is compatible with the descriptor
retrieved from the X server via DRI3.

This will stop radv binding to an X server until we have prime
support in place. Hopefully apps use this API before trying
to render things.

v2: drop unneeded function, don't leak memory. (jekstrand)
v3: also check in surface_get_support callback.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-20 12:53:52 +10:00
Dave Airlie
1f6376935b Revert "radv: detect command buffers that do no work and drop them (v2)"
This just keeps popping up minor problems and regressions we should
revisit in a more sustainable manner later.

This also reverts:
Revert "radv: query cmds should mark a cmd buffer as having draws."
Revert "radv: also fixup event emission to not get culled."

This reverts commit d1640e7932.
This reverts commit 8b47b97215.
This reverts commit b4b19afebe.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-20 09:00:40 +10:00
Bas Nieuwenhuizen
81b2379664 radv: Handle VK_REMAINING_ARRAY_LAYERS in fast clear eliminate.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-19 20:58:06 +01:00
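Handling VK_REMAINING_ARRAY_LAYERS comes down to resolving the sentinel into a real layer count before iterating. A minimal sketch, assuming the image's total layer count is known; the constant is redefined locally with the same value Vulkan uses (~0U) so the example needs no Vulkan headers, and `resolve_layer_count` is a hypothetical name, not the radv helper.

```c
#include <stdint.h>

/* Same value as Vulkan's VK_REMAINING_ARRAY_LAYERS. */
#define REMAINING_ARRAY_LAYERS (~0u)

/* Hypothetical helper: turn the layerCount of a subresource range into
 * the number of layers that should actually be processed. */
uint32_t resolve_layer_count(uint32_t image_layers, uint32_t base_layer,
                             uint32_t layer_count)
{
    if (layer_count == REMAINING_ARRAY_LAYERS)
        return image_layers - base_layer;   /* everything from base on */
    return layer_count;
}
```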
Marek Olšák
c8ef512398 gallium/u_index_modify: don't add PIPE_TRANSFER_UNSYNCHRONIZED unconditionally
It's OK for r300g (because r300g can't write to buffers via the GPU), but
not later hardware. This issue was spotted randomly.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-19 17:16:26 +01:00
Marek Olšák
a264fee624 radeonsi: fix UNSIGNED_BYTE index buffer fallback with non-zero start (v2)
start can only be non-zero with MultiDrawElements, which is unlikely
to occur with UNSIGNED_BYTE indices.

v2: Also fix the util_shorten_ubyte_elts_to_userptr call.
    Tested with the new piglit.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-19 17:16:26 +01:00
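The fallback in the commit above converts 8-bit indices to 16-bit ones, and the fix is about honouring a non-zero start. The sketch below shows the idea only; `widen_ubyte_indices` is a hypothetical helper and not the signature of util_shorten_ubyte_elts_to_userptr.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy `count` 8-bit indices, beginning at element
 * `start` of `in`, into the 16-bit array `out`. out[0] corresponds to
 * in[start], so a draw using the new buffer starts at index 0. */
void widen_ubyte_indices(const uint8_t *in, size_t start, size_t count,
                         uint16_t *out)
{
    for (size_t i = 0; i < count; i++)
        out[i] = in[start + i];
}
```

Ignoring `start` here would silently draw with the first `count` indices of the buffer instead of the requested ones.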
Dave Airlie
9aec76aca3 radv: handle layered fast clears.
This iterates the fast clear flush across the layers in the
specified range.

It also moves the compute resolve flush into the function
and builds the range in there.

This fixes:
dEQP-VK.geometry.layered.* regressions since fast clears.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-02-19 20:30:01 +10:00
Dave Airlie
efc89edf5a radv: pass subresourceRange by pointer.
This struct is 5 dwords, we should really just pass a pointer
to it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-19 20:28:22 +10:00
Dave Airlie
2b3c490e23 radv: fix typo in a2b10g10r10 fast clear calculation.
This fixes:
dEQP-VK.renderpass.formats.a2b10g10r10_unorm_pack32*
regressions.

Fixes:
f22836dbdd radv: Add CPU color packing for VK_FORMAT_A2B10G10R10_UNORM_PACK32.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-02-19 20:27:28 +10:00
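For reference, the bit layout involved in the commit above can be sketched as follows. This follows the Vulkan definition of VK_FORMAT_A2B10G10R10_UNORM_PACK32 (R in bits 0..9, G in 10..19, B in 20..29, A in 30..31); it is not the radv fast-clear code, and both function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: convert a float in [0, 1] to an unsigned
 * normalized integer of the given width, rounding to nearest. */
uint32_t float_to_unorm(float v, unsigned bits)
{
    uint32_t max = (1u << bits) - 1u;
    if (!(v > 0.0f))            /* also catches NaN */
        return 0;
    if (v >= 1.0f)
        return max;
    return (uint32_t)(v * (float)max + 0.5f);
}

/* Pack four floats into one 32-bit A2B10G10R10 word. */
uint32_t pack_a2b10g10r10_unorm(float r, float g, float b, float a)
{
    return  float_to_unorm(r, 10)
         | (float_to_unorm(g, 10) << 10)
         | (float_to_unorm(b, 10) << 20)
         | (float_to_unorm(a, 2)  << 30);
}
```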
Bas Nieuwenhuizen
c7fcaf2314 radv: Invert ring SGPR check.
I assume this wants to check if all pipelines use the same SGPR for
the rings.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-02-19 10:13:11 +01:00
Bas Nieuwenhuizen
e12cf3f9bf radv: Clamp framebuffer dimensions to min. attachment dimensions.
Even though the preferred stance is not to fix incorrect applications
via the driver, this prevents some nasty GPU hangs.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-19 10:13:01 +01:00
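The clamping in the commit above can be sketched as taking the minimum over all attachment sizes. The `extent2d` struct and the function name are hypothetical stand-ins for the driver's own types.

```c
#include <stddef.h>
#include <stdint.h>

struct extent2d {
    uint32_t width;
    uint32_t height;
};

/* Hypothetical helper: shrink the requested framebuffer size so that it
 * never exceeds the smallest attachment in either dimension. */
struct extent2d clamp_framebuffer_extent(struct extent2d fb,
                                         const struct extent2d *attachments,
                                         size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (attachments[i].width < fb.width)
            fb.width = attachments[i].width;
        if (attachments[i].height < fb.height)
            fb.height = attachments[i].height;
    }
    return fb;
}
```

With this in place, an application that declares a framebuffer larger than its attachments renders to the clamped area rather than addressing memory outside them.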
Marek Olšák
ad019bf5c6 gallium: remove TGSI_OPCODE_CLAMP
Not used and not widely supported. Use MIN+MAX instead.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 02:58:43 +01:00
Marek Olšák
675ef9c0c7 ac/llvm: use min+max instead of AMDGPU.clamp on LLVM 5.0
It selects v_med3_f32, which has the same rate & size.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 02:58:43 +01:00
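Why min+max can be matched to a median-of-three instruction: for lo <= hi, clamping x to [lo, hi] gives the same result as taking the median of x, lo and hi. The sketch below demonstrates that identity on the CPU; it ignores NaN handling, and the function names are hypothetical.

```c
float min_f(float a, float b) { return a < b ? a : b; }
float max_f(float a, float b) { return a > b ? a : b; }

/* Clamp expressed as MIN applied to a MAX, as in the commits above. */
float clamp_min_max(float x, float lo, float hi)
{
    return min_f(max_f(x, lo), hi);
}

/* Median of three values, the operation v_med3_f32 computes. */
float med3(float a, float b, float c)
{
    return max_f(min_f(a, b), min_f(max_f(a, b), c));
}
```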
Marek Olšák
660b55e6d9 radeonsi: stop using TGSI_OPCODE_CLAMP by moving it to amd/common
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 02:58:43 +01:00
Marek Olšák
73d1c8c686 tgsi/lowering: stop using TGSI_OPCODE_CLAMP
v2: do it correctly

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 02:58:43 +01:00
Marek Olšák
1d1b769561 st/mesa: stop using TGSI_OPCODE_CLAMP
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 02:58:43 +01:00
Marek Olšák
45240ce598 radeonsi: use R600_RESOURCE_FLAG_UNMAPPABLE where it's desirable
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
a41587433c gallium/radeon: add R600_RESOURCE_FLAG_UNMAPPABLE
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
9434421213 gallium/radeon: change r600_aligned_buffer_create to take flags, not bind
All call sites set bind = 0. The next commit will use this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
ac6007460a radeonsi: upload constants into VRAM instead of GTT
This lowers lgkm wait cycles by 30% on VI and normal conditions.
There might be a measurable improvement when CE is disabled (radeon)
or under L2 thrashing.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
a550fbb510 gallium/radeon: use TCC line size as alignment in other places
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
791e8ce04a radeonsi: use a clever alignment for index buffer uploads
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
d6c8c26851 radeonsi: use a clever alignment for descriptor uploads
Non-VBO descriptors won't be smaller than the cache line, so simply use
the cache line size.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
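Aligning an upload offset to the cache line size, as the commit above does for descriptors, is usually a round-up to a power of two. A minimal sketch; the 64-byte line size used in the usage note is an assumption for illustration, and `align_up` is a hypothetical helper rather than the driver's own macro.

```c
#include <assert.h>
#include <stdint.h>

/* Round `offset` up to the next multiple of `alignment`, which must be
 * a non-zero power of two. */
uint32_t align_up(uint32_t offset, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

With an assumed 64-byte cache line, an upload that would start at byte 65 is moved to byte 128, so it never shares a line with the previous upload.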
Marek Olšák
6b73aafceb radeonsi: use a clever alignment for constant buffer uploads
This results in a very tiny decrease in lgkm wait cycles.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
620aded541 radeonsi: move index buffer flushing into a non-upload indexed case
The other codepaths don't need this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
22b8a773e1 radeonsi: use SI_MAX_ATTRIBS where it should be used
for consistency; no change in behavior

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
054f853035 radeonsi: sort members of si_shader_key::part
and improve some comments

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
1fabb29717 radeonsi: have separate LS and ES main shader parts in the shader selector
This might reduce the on-demand compilation if the initial VS/LS/ES
determination is wrong.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
a02117ba6e radeonsi: don't compile pure monolithic shaders asynchronously
there is no point, we have to wait anyway.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
9b91e0b54c radeonsi: allow unaligned vertex buffer offsets and strides on CIK-VI
So that we can disable u_vbuf for GL core profiles.

This is a v2 of the previous VI-only patch.
It requires SH_MEM_CONFIG.ALIGNMENT_MODE = UNALIGNED on CIK-VI.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
2fb021b620 radeonsi: remove the fix_size3 workaround
not needed with the shader fallback

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
dbd38f2a92 radeonsi: add a workaround for clamping unaligned RGB 8 & 16-bit vertex loads
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
41a2157a68 radeonsi: make fix_fetch an array of uint8_t
so that we can add 3-component fallbacks.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
f246ae1ee9 vl: fix a buffer leak in the bicubic filter by using an uploader
there's no error checking, because the previous code didn't do it either.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
c8d84801b7 gallium/hud: create files after graphs are created to get final names
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
22c34bbc55 gallium/u_suballoc: allow setting pipe_resource::flags
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
edf6bcf6c6 gallium/u_suballoc: use clear_buffer if available
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
02cd8b20d1 gallium/util: correctly unref a buffer in u_prim_restart
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
42297c862f gallium/util: remove unused u_index_modify helpers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
7f0bf00dc9 gallium/util: remove unused helper util_draw_texquad
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák
b5b0936677 gallium/docs: remove documentation of non-existent instructions
trivial
2017-02-18 01:22:08 +01:00
Jason Ekstrand
5f02c2a054 anv/TODO: Check off Storage Image Without Format
The code for this landed a few days ago.
2017-02-17 14:18:34 -08:00
Marek Olšák
edd23e0606 ac/llvm: fix various findMSB bugs
sffbh needs to be suffixed with ".i32"

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-18 06:24:32 +10:00
Jose Maria Casanova Crespo
429f112a11 glsl: link error if unsized array not-last in ssbo
If an unsized array is not the last member of an SSBO
and an implicit size cannot be determined at link time,
the linker should raise an error instead of hitting
an assertion on GL.

This reverts part of commit 3da08e1664
getting back to the behavior of commit 5b2675093e

The original patch was correct for GLES that should produce
a compile-time error but the linker error is still necessary
in desktop GL.

Fixes the following piglit tests:
tests/spec/arb_shader_storage_buffer_object/non_integral_size_array_member.shader_test
tests/spec/arb_shader_storage_buffer_object/unsized_array_member.shader_test

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
2017-02-17 15:49:16 +02:00
Lionel Landwerlin
a0ac118398 i965/fs: fix uninitialized memory access
Found while running shader-db under valgrind.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-02-17 10:07:56 +00:00
Timothy Arceri
62c90492ef glsl: disable on disk shader cache when running as another user
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 20:21:22 +11:00
Alejandro Piñeiro
966ddd5d3d mesa/formatquery: use consistent local function names
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-02-17 10:17:54 +01:00
Bas Nieuwenhuizen
d5bf4c7394 radv: Use different allocator for descriptor set vram.
This one only keeps allocated memory in the list, and list nodes
in the descriptor sets. This doesn't need messing around with
max_sets, and we get automatic merging of free regions.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-17 09:28:23 +01:00
Bas Nieuwenhuizen
f448701622 radv: Never try to create more than max_sets descriptor sets.
We only use the freed ones after all free space has been used. If
the app only allocates small descriptor sets, we might go over
max_sets before the memory is full.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
Fixes: f4e499ec79
2017-02-17 09:28:14 +01:00
Samuel Iglesias Gonsálvez
fccbad73ef i965/fs: fix 32-bit data type to int64 conversion on BSW/BXT
The 32-bit to 64-bit conversions need to have the 32-bit
data source elements aligned to 64-bit but only with doubles as
destination type.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-02-17 06:50:22 +01:00
Timothy Arceri
172c48cc15 glsl: fix scons builds with shader cache
For now it's disabled for scons so wrap glsl cache calls in a
define conditional.
2017-02-17 16:31:47 +11:00
Timothy Arceri
a2bf0954fb util/disk_cache: fix typo in function stub
2017-02-17 15:54:00 +11:00
Jason Ekstrand
b073811617 i965/fs: Remove hand-coded 64-bit packing optimizations
The optimization in unpack_64 is clearly subsumed with the opt_algebraic
optimizations in the previous commit.  The pack optimization may not be
quite handled by opt_algebraic but opt_algebraic should get the really
bad cases.  Also, it's been broken since it was merged and we've never
noticed so it must not be doing anything.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-16 17:28:03 -08:00
Jason Ekstrand
70e86a3f2d nir/algebraic: Optimize 64bit pack/unpack
This reduces the instruction count in some fp64 and int64 piglit tests

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-16 17:28:03 -08:00
Jason Ekstrand
e10f522cd7 nir: Rename lower_double_pack to lower_64bit_pack
There's nothing "double" about it other than, perhaps, the fact that it
packs two 32-bit values.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-16 17:28:03 -08:00
Jason Ekstrand
161d3e81be nir: Combine the int and double [un]pack opcodes
NIR is a typeless IR and the two opcodes, when considered bitwise, do
exactly the same thing.  There's no reason to have two versions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-16 17:28:03 -08:00
Jason Ekstrand
a4393bd97f i965/fs: Fix the inline nir_op_pack_double optimization
We can only do the optimization if the source *is* SSA.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-02-16 17:28:03 -08:00
George Kyriazis
e2abe80bee swr: remove unneeded extern "C"
the guards have been added to the header files that needed them.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-16 18:22:27 -06:00
George Kyriazis
d4b4a511f6 gallium: add extern "C" guards
Added extern "C" __cplusplus guards on headers that did not have them.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-16 18:22:27 -06:00
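The guard pattern referred to in the commit above is the usual one shown below. `example_add` is a hypothetical function used only to give the guarded region something to declare.

```c
/* Header part: when included from C++, give the declarations C linkage
 * so the C++ compiler does not mangle their names. */
#ifdef __cplusplus
extern "C" {
#endif

int example_add(int a, int b);

#ifdef __cplusplus
}
#endif

/* Implementation part, normally in a separate .c file. */
int example_add(int a, int b)
{
    return a + b;
}
```

Once the header carries its own guards, C++ callers can include it directly instead of wrapping the #include in their own extern "C" block.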
Timothy Arceri
a3ab09f90f util/disk_cache: check cache exists before calling munmap()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:43 +11:00
Timothy Arceri
512c046edd util/disk_cache: add support for removing old versions of the cache
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:43 +11:00
Timothy Arceri
3342ce452c util/disk_cache: allow drivers to pass a directory structure
In order to avoid costly fallback recompiles when cache items are
created with an old version of Mesa or for a different gpu on the
same system we want to create directories that look like this:

./{TIMESTAMP}_{LLVM_TIMESTAMP}/{GPU_ID}

Note: The disk cache util will take a single timestamp string, it is
up to the backend to concatenate the llvm string with the mesa string
if applicable.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:43 +11:00
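The directory layout from the commit above can be sketched as two steps: the backend joins its Mesa and LLVM timestamps into one string, and the cache code appends the GPU identifier. Both function names are hypothetical, not the util/disk_cache API.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical backend-side helper: build "{TIMESTAMP}_{LLVM_TIMESTAMP}".
 * Returns 1 on success, 0 if the result did not fit in `out`. */
int join_timestamps(char *out, size_t size,
                    const char *mesa_ts, const char *llvm_ts)
{
    int n = snprintf(out, size, "%s_%s", mesa_ts, llvm_ts);
    return n >= 0 && (size_t)n < size;
}

/* Hypothetical cache-side helper: build "./{timestamp}/{GPU_ID}" from
 * the single timestamp string handed over by the backend. */
int build_cache_subdir(char *out, size_t size,
                       const char *timestamp, const char *gpu_id)
{
    int n = snprintf(out, size, "./%s/%s", timestamp, gpu_id);
    return n >= 0 && (size_t)n < size;
}
```

Because the timestamp and GPU are part of the path, items written by another Mesa build or for another GPU simply live in a different directory and are never looked up.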
Timothy Arceri
87009681a5 mesa: remove cache creation from _mesa_initialize_context()
We will change the way we create the cache directory in the following
patches.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:43 +11:00
Timothy Arceri
6602d0401c st/mesa/glsl: build string of dri options and use as input to building sha for shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:43 +11:00
Timothy Arceri
ed61530121 glsl: reserve parameter storage on cache restore
Since we know how big the list will be we can allocate the storage
upfront.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:43 +11:00
Timothy Arceri
1183eb487f glsl: don't try to load/store buffer object values in the cache
Also add an assert to catch buffer overflows.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:43 +11:00
Timothy Arceri
cad1a9bfde glsl: don't reprocess or clear UBOs on cache fallback
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:43 +11:00
Timothy Arceri
01d1e5a7ad glsl: skip more uniform initialisation when doing fallback linking
We already pull these values from the metadata cache so no need to
recreate them.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:43 +11:00
Timothy Arceri
794f7326bc glsl: don't lose uniform values when falling back to full compile
Here we skip the recreation of uniform storage if we are relinking
after a cache miss. This is important because uniform values may
have already been set by the application and we don't want to reset
them.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:43 +11:00
Timothy Arceri
0e9991f957 glsl: don't reference shader prog data during cache fallback
We already have a reference.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:43 +11:00
Timothy Arceri
2f19accc5e mesa/glsl: add cache_fallback flag to gl_shader_program_data
This will allow us to skip certain things when falling back to
a full recompile on a cache miss such as avoiding reinitialising
uniforms.

In this change we use it to avoid reading the program metadata
from the cache and skipping linking during a fallback.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:43 +11:00
Timothy Arceri
e3adde023b glsl: add api and glsl version to hash generation for shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:43 +11:00
Timothy Arceri
dc0c0c176d glsl: cache uniform values
These may be lowered constant arrays or uniform values that we set before linking
so we need to cache the actual uniform values.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:43 +11:00
Timothy Arceri
49f3439089 glsl: make uniform values helper available for use elsewhere
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:43 +11:00
Timothy Arceri
bb16cf805d glsl: cache some more image metadata
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:43 +11:00
Timothy Arceri
a3ff840d05 glsl: add support for caching atomic buffers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
3d15d814c0 glsl: add shader cache support for buffer blocks
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
6761259958 glsl: store subroutine remap table in shader cache
V2: use new helpers to store/restore table entries.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
787535fb11 glsl: add support for caching subroutines
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
0057de58f9 glsl: add support for caching shaders with xfb qualifiers
For now this disables the shader cache when transform feedback is
enabled via the GL API as we don't currently allow for it when
generating the sha for the shader.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
3bbfee3cd3 glsl: add shader cache support for samplers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
c4cff5f402 glsl: add basic support for resource list to shader cache
This initially adds support for simple uniforms and varyings.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
3c45d8f464 glsl: fix uniform remap table cache when explicit locations used
V2: don't store pointers; use an enum instead to flag what should be
restored. Also do the work in a helper that we will later use for
the subroutine remap table.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Carl Worth
a01973a784 glsl: Serialize three additional hash tables with program metadata
The three additional tables are AttributeBindings, FragDataBindings,
and FragDataIndexBindings.

The first table (AttributeBindings) was identified as missing by
trying to test the shader cache with a program that called
glGetAttribLocation.

Many thanks to Tapani Pälli <tapani.palli@intel.com>, as it was review
of related work that he had done previously that pointed me to the
necessity to also save and restore FragDataBindings and
FragDataIndexBindings.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
e5bb4a0b0f glsl: use correct shader source in case of cache fallback
The scenario is:

glShaderSource
glCompileShader <-- deferred due to cache hit of shader

glShaderSource <-- with new source code

glAttachShader
glLinkProgram <-- no cache hit for program

At this point we need to compile the original source when we
fallback.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
8771940682 glsl: make use of on disk shader cache
The hash key for glsl metadata is a hash of the hashes of each GLSL
source string.

This commit uses the put_key/get_key support in the cache to put the SHA-1
hash of the source string for each successfully compiled shader into the
cache. This allows for early, optimistic returns from glCompileShader
(if the identical source string had been successfully compiled in the past),
in the hope that the final, linked shader will be found in the cache.
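
The "hash of the hashes" idea can be sketched as follows. This is only an illustration: it swaps in a 64-bit FNV-1a hash for SHA-1 to stay self-contained, and `fnv1a`, `metadata_key` and `MAX_SOURCES` are hypothetical names, not functions from the shader cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SOURCES 8

/* 64-bit FNV-1a, used here only as a stand-in for SHA-1. */
static uint64_t
fnv1a(const void *data, size_t size)
{
   const unsigned char *p = data;
   uint64_t h = 14695981039346656037ULL;

   for (size_t i = 0; i < size; i++) {
      h ^= p[i];
      h *= 1099511628211ULL;
   }
   return h;
}

/* Hypothetical key derivation: hash every source string on its own,
 * then hash the array of per-source hashes to get one key. */
static uint64_t
metadata_key(const char *const *sources, size_t count)
{
   uint64_t hashes[MAX_SOURCES];

   assert(count <= MAX_SOURCES);
   for (size_t i = 0; i < count; i++)
      hashes[i] = fnv1a(sources[i], strlen(sources[i]));

   return fnv1a(hashes, count * sizeof(hashes[0]));
}
```

With this shape, the per-source hashes can be computed at compile time and the combined key only once all attached sources are known.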

This is based on the initial patch by Carl.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
34ca0fce22 glsl: add initial implementation of shader cache
This uses disk_cache.c to write out a serialization of various
state that's required in order to successfully load and use a
binary written out by a driver's backend; this state is referred to as
"metadata" throughout the implementation.

This initial version is intended to work with all stages besides
compute.

This patch is based on the initial work done by Carl.

V2: extend the file's doxygen comment to cover some of the
design decisions.

V3:
- skip cache for fixed function shaders
- add int64 support
- fix glsl IR program parameter caching/restore and cache the
  parameter values which are used by gallium backends.
- use new link status enum

V4:
- add compute program support

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Dave Airlie
b0232d98e9 radeonsi: use shared emit_umsb helper.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 22:57:16 +00:00
Dave Airlie
ebed22ec67 radv/ac: use shared umsb helper.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 22:57:16 +00:00
Dave Airlie
0ec66b9969 radeon/ac: add emit umsb shared code.
Since we shared imsb, makes sense to share umsb.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 22:57:16 +00:00
Dave Airlie
4617ad07e0 radeon/ac: use llvm.amdgcn.sffbh intrinsic instead of AMDGPU.flbit.i32
Use the newer intrinsic.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 22:57:16 +00:00
Dave Airlie
e933331cd7 radeonsi: use shared emit imsb code.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 22:57:15 +00:00
Dave Airlie
fb15a1e9dd radv/ac: use shared imsb emission code.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 22:57:15 +00:00
Dave Airlie
cae1ff1a4b radeon/ac: add ac_emit_imsb helper.
We want to use a different intrinsic on newer llvm, so move this
code to a shared area.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 22:57:15 +00:00
Emil Velikov
40bf7ba023 egl: _eglFilterArray's filter is always non-null
Drop the extra handling and assert() if things change in the future.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-16 15:27:20 +00:00
Emil Velikov
b8ae2fe3e6 docs: add hyperlink to the releasing documentation
Other files such as xlibdriver.html and versions.html explicitly left
out, for now.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-16 15:25:02 +00:00
Emil Velikov
cadf174866 util/disk_cache: do not allow space in MESA_GLSL_CACHE_MAX_SIZE
No other env var used in mesa allows for space in the variable contents.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-02-16 15:22:17 +00:00
Emil Velikov
350e8e821f configure.ac: remove unneeded trailing semicolon
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-16 15:17:52 +00:00
Emil Velikov
78c747e820 r100: use correct libdrm_radeon macro
Remove local definition of RADEON_INFO_TILE_CONFIG and use the correct
macro provided by libdrm_radeon RADEON_INFO_TILING_CONFIG.

The latter has been present since libdrm 2.4.22, circa 2010.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-16 15:17:52 +00:00
Emil Velikov
c8f1f2dc2d winsys/radeon: remove fall-back defines
Provided by libdrm as of last commit.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-16 15:17:52 +00:00
Emil Velikov
f3637b3a1e configure.ac: bump LIBDRM_RADEON requirement to 2.4.71
Such that we can remove all the local fall-back definitions and use the
official UABI ones.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-16 15:17:52 +00:00
Emil Velikov
389478c4e9 bin/get-fixes-pick-list.sh: add new script
The script parses the "Fixes" tags and nominates respective commit if
applicable.

Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-16 15:17:52 +00:00
Emil Velikov
f1b0b75099 bin/get-pick-list.sh: remove ancient way of nominating patches
The old way of nominating patches [NOTE: .*[Cc]andidate] was
deprecated and has been unused for approx. 3 years.

Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-16 15:17:52 +00:00
Emil Velikov
d6b1d11d4f bin/get-pick-list.sh: limit `git grep ...' only as needed
Analogous to previous commit.

Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-16 15:17:52 +00:00
Emil Velikov
d292f12d94 bin/get-typod-pick-list.sh: limit `git grep ...' to only as needed
The currently used range HEAD..origin/master is far too broad. It looks
for nominations within the already_landed list (branchpoint..HEAD).

Similarly we look for already_landed within the [possible] nominations
range branchpoint..origin/master.

Improve things by limiting the look ups to the branch point.

Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-16 15:17:52 +00:00
Emil Velikov
71e00d62ed bin/get-extra-pick-list: rework to use already_picked list
Currently we loop (git log --grep) to check if the fix has landed. We
can simplify and make things faster by storing the already_picked list
and grepping through it.

Slim down the message while we're here.

Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-16 15:17:52 +00:00
Emil Velikov
cb1947eac7 bin/get-extra-pick-list: use git merge-base to get the branchpoint
Mesa development history is linear and the only divergence is at
the branchpoint, so we can drop the ad-hoc parsing and use git
merge-base to retrieve it.

Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-16 15:17:51 +00:00
Emil Velikov
1c0a536a72 docs: provide some tips where to obtain Mesa binaries
Mention the generic channels (PPA, Corp, other) as well as give a couple
of examples. Even if the latter become out of date, the former should
be a good guide.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-16 15:17:51 +00:00
Emil Velikov
99266ec3ce docs/submittingpatches: assorted grammar fixes
Cc: Ben Crocker <bcrocker@redhat.com>
Suggested-by: Ben Crocker <bcrocker@redhat.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-16 15:17:51 +00:00
Emil Velikov
e280a6bc8a docs/releasing: update the website section
Things are automated via git hooks.

Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
---
Guys, let me know when things are in place.
2017-02-16 15:17:51 +00:00
Emil Velikov
652e367d5f docs/releasing: tweak the glxinfo/glxgear/etc. command lines
Print only the information needed. Namely:
*info: the DRI module picked and the vendor/renderer strings
*gears: everything but the "...configuration file..." line(s)

v2: (Eric) Use "2>&1 |" over "|&", properly escape &.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-16 15:17:51 +00:00
Emil Velikov
f9b18d5acc docs/releasing: build test the scons/mingw build
We had multiple cases in the past where files used only by the
Scons/MinGW/Windows build were missing.

Avoid such instances and add a step to catch them early.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-16 15:17:51 +00:00
Dave Airlie
03f4982c68 nir: handle some 64-bit integer conversions
These are enough for the spir-v generator to handle UConvert
and SConvert operations, and fix the 4 tests in CTS.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 14:13:21 +10:00
Dave Airlie
adb9555794 nir: handle 64-bit integer types in glsl->nir type conversion.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 14:13:14 +10:00
Dave Airlie
14167080e2 spirv: handle SpvOpUConvert in proper place.
This was falling into the quantizetof16 path.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 14:11:59 +10:00
Dave Airlie
2d0b145902 spirv: add support for Int64 capability
This just adds the support at the spirv->nir level for the Int64
cap.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 14:11:13 +10:00
Dave Airlie
48ebdbecc5 spirv/nir: add support for int64
This adds the spirv->nir conversion for int64 types.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 14:11:05 +10:00
Dave Airlie
7593f2ac1b nir/types: add C accessors for 64-bit integer types.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 14:10:45 +10:00
Dave Airlie
b292e662fc radv: add fast color clear for b10g11r11
This is used in DOOM, so provide the fast clear path for it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 14:09:15 +10:00
Timothy Arceri
e6506b3cd2 mesa: retain gl_shader_programs after glDeleteProgram if they are in use
Fixes regressions from c505d6d852.

Switching from using gl_shader_program to gl_program for the pipeline
objects CurrentProgram array meant we were freeing gl_shader_programs
immediately after glDeleteProgram was called, but the spec states
the program should only get deleted once it is no longer in use.

To work around this we add a new ReferencedPrograms array to track
gl_shader_programs in use.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-16 15:01:41 +11:00
Timothy Arceri
300900516d mesa: remove tabs in dri xmlconfig.c
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-16 14:47:13 +11:00
Timothy Arceri
703b592f7a mesa: style fixes for dri xmlconfig.c
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-16 14:47:13 +11:00
Chris Wilson
ed442ee39b i965: Do not use purged bo after calling glObjectUnpurgeable
If the buffer has been freed by the kernel under memory pressure, it is
invalid to try and access the backing storage for that buffer in the
future - the backing storage is not recreated automatically. As such we
need to mark the GL object as being freed for unretained buffers and so
recreate the object on next use.

Furthermore, from the GL_APPLE_object_purgeable spec:

    "In contrast, by calling ObjectUnpurgeableAPPLE with an <option> of
    UNDEFINED_APPLE, the application is indicating that it intends to
    recreate the contents of the storage from scratch.  Further, the
    application is is stating that it would like the GL to do only the
    minimal amount of work set PURGEABLE_APPLE to FALSE.   If
    ObjectUnpurgeableAPPLE is called with the <option> set to
    UNDEFINED_APPLE, then ObjectUnpurgeableAPPLE will return the value
    UNDEFINED_APPLE."

we must always report GL_UNDEFINED_APPLE when called with
glObjectUnpurgeable(GL_UNDEFINED_APPLE).

Testcase: piglit/object_purgeable-api-*
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-15 17:00:42 -08:00
Matt Turner
a1891da7c8 Revert "i915: Always enable GL 2.0 support."
This partially reverts commit 97217a40f9.
It leaves ES 2.0 support in place per Ian's suggestion, because ES 2.0
is designed to work on hardware like i915.

Chrome only uses the GPU if you have GL >= 2.0, and using i915 (and
prog_execute) actually hurt performance compared with the software
paths.
2017-02-15 14:52:27 -08:00
Matt Turner
656e30b686 anv: Use build-id for pipeline cache UUID.
The --build-id=... ld flag has been present since binutils-2.18,
released 28 Aug 2007.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-02-15 13:59:51 -08:00
Matt Turner
d4fa083e11 util: Add utility build-id code.
Provides the ability to read the .note.gnu.build-id section of ELF
binaries, which is inserted by the --build-id=... flag to ld.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-02-15 13:59:51 -08:00
Bas Nieuwenhuizen
4e6095ff61 radv: Add support for shaderStorageImageReadWithoutFormat.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-15 21:18:21 +01:00
Bas Nieuwenhuizen
501a4c0d73 spirv: Add support for SpvCapabilityStorageImageReadWithoutFormat.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-15 21:18:18 +01:00
Bas Nieuwenhuizen
53873697e4 radv: Add support for shaderStorageImageWriteWithoutFormat.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-15 21:18:13 +01:00
Eduardo Lima Mitev
633c959fae getteximage: Return correct error value when texture object is not found
glGetTextureSubImage() and glGetCompressedTextureSubImage() are currently
returning INVALID_OPERATION error when the passed texture argument does not
correspond to an existing texture object. However, the error should be
INVALID_VALUE instead. From OpenGL 4.5 spec PDF, section '8.11. Texture
Queries', page 236:

    "An INVALID_VALUE error is generated if texture is not the name of
     an existing texture object."

Same wording applies to the compressed version.

The INVALID_OPERATION error is coming from the call to
_mesa_lookup_texture_err(). This patch uses _mesa_lookup_texture() instead
and emits the correct error in the caller.

Fixes: GL45-CTS.get_texture_sub_image.errors_test

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-15 19:37:21 +01:00
Jason Ekstrand
a9a517f530 util: Fix a typo in Makefile.sources
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-15 10:27:42 -08:00
Lionel Landwerlin
569231c55e i965: define default allow_higher_compat_version value
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 9d16f3903e ("driconf: add allow_higher_compat_version option")
2017-02-15 17:03:31 +00:00
Samuel Pitoiset
124d9dd57f drirc: add allow_higher_compat_version for Tropico 5
v2: s/force_compat_profile/allow_higher_compat_version

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-15 16:15:54 +01:00
Samuel Pitoiset
76c6d85cbd drirc: add allow_higher_compat_version for Crookz - The Big Heist
v2: s/force_compat_profile/allow_higher_compat_version

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-15 16:15:54 +01:00
Samuel Pitoiset
34d587abc2 drirc: add allow_higher_compat_version for Worms WMD
v2: s/force_compat_profile/allow_higher_compat_version

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-15 16:15:54 +01:00
Samuel Pitoiset
9d16f3903e driconf: add allow_higher_compat_version option
Mesa currently doesn't allow creating 3.1+ compatibility profiles
mainly because various features are unimplemented and bugs can
happen.

However, some buggy apps request a compat profile without using
any old features unimplemented in mesa, and they fail to start.

This option should help some games to run but it's not enough
for all (eg. Dying Light).

v2: - s/force_compat_profile/allow_higher_compat_version

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-15 16:15:32 +01:00
Marek Olšák
d1fae627fa gallium/radeon: add a HUD query for monitoring the CS thread activity
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-15 14:35:52 +01:00
Lionel Landwerlin
0fcb92c17d anv: wsi: report presentation error per image request
vkQueuePresentKHR() takes a VkPresentInfoKHR pointer, which includes a
pResults field that must hold the results of all the images
requested to be presented. Currently we're not filling this field.

Also as a side effect we probably want to go through all the images
rather than stopping on the first error.

This commit also makes the QueuePresentKHR() implementation return the
first error encountered.
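
The behaviour described above (visit every image, record a per-image result, return the first error) can be sketched like this. The `enum result` codes, `present_fn` callback and `present_all` are hypothetical stand-ins and not the anv implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes: 0 is success, negative values are errors. */
enum result {
   RESULT_SUCCESS = 0,
   RESULT_OUT_OF_DATE = -1,
   RESULT_SURFACE_LOST = -2,
};

/* Hypothetical callback that presents one image and reports its result. */
typedef enum result (*present_fn)(unsigned image_index);

static enum result
present_all(present_fn present, unsigned count, enum result *results)
{
   enum result first_error = RESULT_SUCCESS;

   assert(present != NULL);
   for (unsigned i = 0; i < count; i++) {
      enum result r = present(i);

      /* Record the per-image result when the caller provided storage. */
      if (results != NULL)
         results[i] = r;

      /* Keep going after a failure, but remember the first error. */
      if (r != RESULT_SUCCESS && first_error == RESULT_SUCCESS)
         first_error = r;
   }
   return first_error;
}

/* Example outcome table: images 1 and 2 fail in different ways. */
static enum result
fake_present(unsigned image_index)
{
   switch (image_index) {
   case 1:  return RESULT_OUT_OF_DATE;
   case 2:  return RESULT_SURFACE_LOST;
   default: return RESULT_SUCCESS;
   }
}
```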

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-02-15 11:43:05 +00:00
Eric Engestrom
fc9b119013 egl: remove duplicate 0 assignment
The memset on the line before already takes care of this.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-15 08:57:05 +00:00
Hans de Goede
4c66f529a8 glx/glvnd: Fix GLXdispatchIndex sorting
Commit 8bca8d89ef ("glx/glvnd: Fix dispatch function names and indices")
fixed the sorting of the array initializers in g_glxglvnddispatchfuncs.c
because FindGLXFunction's binary search needs these to be sorted
alphabetically.

That commit also mostly fixed the sorting of the DI_foo defines in
g_glxglvnddispatchindices.h, which is what actually matters as the
arrays are initialized using "[DI_foo] = glXfoo," but a small error
crept in which at least causes glXGetVisualFromFBConfigSGIX to not
resolve, breaking games such as "The Binding of Isaac: Rebirth" and
"Crypt of the NecroDancer" from Steam and possibly causing
other problems too.

This commit fixes the last of the sorting errors, fixing these mentioned
games not working.
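
Why the ordering matters can be shown with a small sketch. The tables and the `find_index` helper are hypothetical and are not Mesa's FindGLXFunction; they only illustrate that a binary search can miss an entry that is present but out of alphabetical order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup: binary search over a table that is assumed to be
 * sorted with strcmp(). Returns the index of `name`, or -1 if not found. */
static int
find_index(const char *const *names, size_t count, const char *name)
{
   size_t lo = 0, hi = count;

   assert(names != NULL && name != NULL);
   while (lo < hi) {
      size_t mid = lo + (hi - lo) / 2;
      int cmp = strcmp(name, names[mid]);

      if (cmp == 0)
         return (int)mid;
      if (cmp < 0)
         hi = mid;
      else
         lo = mid + 1;
   }
   return -1;
}

/* Correctly sorted table (illustrative subset of GLX entry points). */
static const char *const sorted_names[] = {
   "glXGetFBConfigAttribSGIX",
   "glXGetFBConfigFromVisualSGIX",
   "glXGetVisualFromFBConfigSGIX",
   "glXQueryContextInfoEXT",
};

/* Same entries with two neighbours swapped, as a sorting slip might
 * leave them. */
static const char *const missorted_names[] = {
   "glXGetFBConfigAttribSGIX",
   "glXGetVisualFromFBConfigSGIX",
   "glXGetFBConfigFromVisualSGIX",
   "glXQueryContextInfoEXT",
};
```

Looking up glXGetVisualFromFBConfigSGIX succeeds in the sorted table but fails in the mis-sorted one, even though the entry is there.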

Fixes: 8bca8d89ef ("glx/glvnd: Fix dispatch function names and indices")
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Cc: Adam Jackson <ajax@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-15 09:55:57 +01:00
Dave Airlie
b4b19afebe radv: also fixup event emission to not get culled.
This is possibly a bad idea, I might have to consider a better one.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-15 00:36:30 +00:00
Jason Ekstrand
bfbb362601 anv: Use vk_foreach_struct for handling extension structs
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-14 16:15:39 -08:00
Jason Ekstrand
f76584e7b7 util: Add helpers for iterating over Vulkan extension structs
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-14 16:15:39 -08:00
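
As a rough illustration of what iterating over extension structs means: every Vulkan extension struct starts with an sType tag followed by a pNext pointer, so a chain can be walked generically. The `base_struct` type, the `foreach_struct` macro and the two helpers are hypothetical and only approximate the spirit of the helper added above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the common header shared by extension
 * structs: a type tag followed by a pointer to the next struct. */
struct base_struct {
   int sType;
   const struct base_struct *pNext;
};

/* Hypothetical iteration macro over a pNext chain. */
#define foreach_struct(iter, start) \
   for (const struct base_struct *iter = (const struct base_struct *)(start); \
        iter != NULL; iter = iter->pNext)

/* Return the first struct in the chain with the wanted sType, or NULL. */
static const struct base_struct *
find_struct(const void *start, int sType)
{
   foreach_struct(s, start) {
      if (s->sType == sType)
         return s;
   }
   return NULL;
}

/* Count the structs in a chain; the assert guards against a chain that
 * accidentally loops back on itself. */
static unsigned
chain_length(const void *start)
{
   unsigned n = 0;

   foreach_struct(s, start) {
      (void)s;
      n++;
      assert(n <= 64);
   }
   return n;
}
```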
Dave Airlie
d1640e7932 radv: query cmds should mark a cmd buffer as having draws.
This fixes a regression with the remove non-draw cmd buffers in
queries.

Fixes: 8b47b97215 radv: detect command buffers that do no work and drop them (v2)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-15 00:02:33 +00:00
Kenneth Graunke
a3e4fa5495 glsl: Handle packed_type == ivec4[] in lower_packed_varyings().
For GS input arrays, we may turn a packed_type of ivec4 into an
array of ivec4s.  We still want flat qualification.

Found by inspection.  Not known to help anything.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-02-14 14:47:40 -08:00
Jason Ekstrand
f434a60a53 anv: Implement the Skylake stencil PMA optimization
Unfortunately, this doesn't substantially improve the performance of any
known apps.  With Dota 2 on my Sky Lake gt4, it seems to help by somewhere
between 0% and 1% but there's enough noise that it's hard to get a clear
picture.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-02-14 14:18:55 -08:00
Jason Ekstrand
d665c51eea genxml: Add the CACHE_MODE_0 register on gen9
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-14 14:18:55 -08:00
Jason Ekstrand
028e1137e6 anv/pipeline: Be smarter about depth/stencil state
It's a bit hard to measure because it almost gets lost in the noise,
but this seemed to help Dota 2 by a percent or two on my Broadwell
GT3e desktop.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-02-14 14:18:55 -08:00
Jason Ekstrand
215fed7318 anv/pipeline: Make a copy of VkPipelineDepthStencilStateCreateinfo
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-02-14 14:18:55 -08:00
Jason Ekstrand
e8d52dab48 anv: Add support for the PMA fix on Broadwell
This helps Dota 2 on Broadwell by 8-9%.  I also hacked up the driver and
used the Sascha "shadowmapping" demo to get some results.  Setting
uses_kill to true dropped the framerate on the demo by 25-30%.  Enabling
the PMA fix brought it back up to around 90% of the original framerate.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-02-14 14:18:55 -08:00
Jason Ekstrand
62bba4ba2d genxml: Add the CACHE_MODE_1 register on gen8
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-14 14:18:55 -08:00
Jason Ekstrand
6ce8592836 anv: Disable stencil writes when both write masks are zero
Vulkan doesn't have a stencilWriteEnable bit like it does for depth.
Instead, you have a stencil mask.  Since the stencil mask is handled as
dynamic state, we have to handle it later during command buffer
construction.  This, combined with a later commit, seems to help Dota2
on my Broadwell GT3e desktop by a couple percent because it allows the
hardware to move the depth and stencil writes to early in more cases.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-02-14 14:18:55 -08:00
Jason Ekstrand
114c281e70 anv/entrypoints: Only generate entrypoints for supported features
This changes the way anv_entrypoints_gen.py works from generating a
table containing every single entrypoint in the XML to just the ones
that we actually need.  There's no reason for us to burn entrypoint
table space on a bunch of NV extensions we never plan to implement.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-14 14:18:55 -08:00
Connor Abbott
6319bfc2a6 anv: fix Get*MemoryRequirements for !LLC
Even though we supported both coherent and non-coherent memory types, we
effectively forced apps to use the coherent types by accident. Found by
inspection, only compile tested.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-02-14 13:05:44 -08:00
Marek Olšák
b5eb38f071 radeonsi: implement uploading zero-stride vertex attribs
This is the only kind of user buffer we can get with the GL core profile.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-14 22:04:35 +01:00
Marek Olšák
b8f3b00742 gallium/radeon: include SDMA in the GPU load query
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-14 21:47:51 +01:00
Marek Olšák
579ffe81f1 gallium/hud: add monitoring of API thread busy status
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-14 21:47:51 +01:00
Marek Olšák
626e4ef18f gallium/u_queue: add util_queue_get_thread_time_nano
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-14 21:47:51 +01:00
Marek Olšák
6c61a8bfc6 gallium/os: add per-thread time clock queries
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-14 21:47:51 +01:00
Marek Olšák
5d19b503af st/mesa: tell u_vbuf that GL core doesn't have user VBOs
I think this only affects radeonsi - VI, because all other drivers using
u_vbuf probably don't support GL_DOUBLE, so they won't be affected by this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-14 21:47:51 +01:00
Marek Olšák
e0f95ddd3e gallium: let state trackers tell u_vbuf whether user VBOs are possible
This can affect whether u_vbuf will be enabled or not.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-14 21:47:51 +01:00
Marek Olšák
0561b3c75a vdpau: skip vlVdpOutputSurfacePutBitsNative with a zero-area rectangle
This prevents errors:
"EE r600_texture.c:1571 r600_texture_transfer_map - failed to create
 temporary texture to hold untiled copy"

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99542

Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-02-14 21:47:51 +01:00
Marek Olšák
c196efcf03 gallium/radeon: add an assertion to texture_transfer_map for app bugs
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Kai Wasserbäch <kai@dev.carbon-project.org>
2017-02-14 21:47:51 +01:00
Marek Olšák
4c36553a46 radeonsi: implement legacy GL_DOUBLE vertex formats
so that we can disable u_vbuf for GL core profiles.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-14 21:47:51 +01:00
Marek Olšák
2c8ee2e825 radeonsi: clean up si_get_param
has_streamout is always true

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-14 21:47:51 +01:00
Marek Olšák
4fe1fd4df4 gallium/hud: don't use user vertex buffers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-14 21:46:16 +01:00
Marek Olšák
00d170a5c3 gallium/hud: call u_upload_alloc only once
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-14 21:46:16 +01:00
Marek Olšák
5699c8a2f7 gallium/u_upload_mgr: remove deprecated function u_upload_buffer
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Charmaine Lee <charmainel@vmware.com>
2017-02-14 21:46:16 +01:00
Marek Olšák
2ca3548eb9 gallium/radeon: remove the internal u_upload_mgr pointer
also remove the BIND flags

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Tested-by: Charmaine Lee <charmainel@vmware.com>
2017-02-14 21:46:16 +01:00
Marek Olšák
1e20112abd st/mesa: use the common uploader (v2)
v2: use const_uploader

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1)
Tested-by: Charmaine Lee <charmainel@vmware.com>
2017-02-14 21:46:16 +01:00
Marek Olšák
d3de8e1096 gallium/vl: use the common uploader
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Tested-by: Charmaine Lee <charmainel@vmware.com>
2017-02-14 21:46:16 +01:00
Marek Olšák
b1dc347822 gallium/vbuf: use the common uploader
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Tested-by: Charmaine Lee <charmainel@vmware.com>
2017-02-14 21:46:16 +01:00
Marek Olšák
5fe5321633 gallium/blitter: use the common uploader
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Tested-by: Charmaine Lee <charmainel@vmware.com>
2017-02-14 21:46:16 +01:00
Marek Olšák
8a84585951 gallium/primconvert: use the common uploader
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Tested-by: Charmaine Lee <charmainel@vmware.com>
2017-02-14 21:46:16 +01:00
Marek Olšák
9f78ec39e9 gallium/hud: use the common uploader
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Tested-by: Charmaine Lee <charmainel@vmware.com>
2017-02-14 21:46:16 +01:00
Marek Olšák
55ad59d2b7 gallium: set pipe_context uploaders in drivers (v3)
Notes:
- make sure the default size is large enough to handle all state trackers
- pipe wrappers don't receive transfer calls from stream_uploader, because
  pipe_context::stream_uploader points directly to the underlying driver's
  stream_uploader (to keep it simple for now)

v2: add error handling to nv50, nvc0, noop
v3: set const_uploader

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1)
Tested-by: Charmaine Lee <charmainel@vmware.com>
2017-02-14 21:46:16 +01:00
Marek Olšák
998396c32e gallium/u_upload_mgr: add a helper that creates the default uploader
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Tested-by: Charmaine Lee <charmainel@vmware.com>
2017-02-14 21:46:16 +01:00
Marek Olšák
d71bc0d741 gallium: add common uploaders into pipe_context (v2)
For lower memory usage and more efficient updates of the buffer residency
list. (e.g. if drivers keep seeing the same buffer for many consecutive
"add" calls, the calls can be turned into no-ops trivially)

v2: add const_uploader, add documentation

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Tested-by: Charmaine Lee <charmainel@vmware.com>
2017-02-14 21:46:16 +01:00
Dave Airlie
3360dbe0c1 radv: fixup IA_MULTI_VGT_PARAM handling.
This ports the remains of the workarounds from radeonsi for
the non-TESS cases. It should provide equivalent workarounds
for Hawaii and Bonaire.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-14 20:29:19 +00:00
Dave Airlie
a465eae38f radv: fix warning since using common gs emit code
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-14 20:02:13 +00:00
Dave Airlie
09bf5491c4 radv: adopt some init config workarounds from radeonsi.
Just one bonaire fix.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-15 05:02:33 +10:00
Dave Airlie
eea562f875 radv: re-enable init gfx state on CIK.
Once the color alignment was fixed this works fine now.

Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-15 05:02:29 +10:00
Dave Airlie
5e988ac61f radv: align the initial state command buffer.
This just adds the padding to align this to an 8 dword boundary.
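
As an illustration of the arithmetic only (not the code from this commit), the number of padding dwords needed to reach an 8-dword boundary can be computed as below. The helper name `padding_dwords` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many padding dwords must be appended so that
 * a command buffer currently holding `cdw` dwords ends on a multiple of
 * `alignment` dwords. `alignment` must be a power of two (8 here). */
static uint32_t padding_dwords(uint32_t cdw, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (alignment - (cdw & (alignment - 1))) & (alignment - 1);
}
```

The final mask makes an already aligned count yield zero padding rather than a full extra group of dwords.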

Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-15 05:02:21 +10:00
Dave Airlie
0f1a4220a6 radv: fix cik macroModeIndex.
This is just a CIK fix ported from radeonsi.

Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-15 05:02:13 +10:00
Dave Airlie
06ffd29925 radv: change base alignment for allocated memory.
On some CIK (Hawaii) this needs to be at least 64k, I'm not 100% sure
it doesn't need to be 128k.

This was causing fast clear eliminate to overwrite the previous buffer,
which, since my gfx init code was added, was the indirect buffer.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=99692
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-15 04:59:57 +10:00
Alex Smith
924a8cbb40 anv: Add support for shaderStorageImageWriteWithoutFormat
This allows shaders to write to storage images declared with unknown
format if they are decorated with NonReadable ("writeonly" in GLSL).

Previously an image view would always use a lowered format for its
surface state, however when a shader declares a write-only image, we
should use the real format. Since we don't know at view creation time
whether it will be used with only write-only images in shaders, create
two surface states using both the original format and the lowered
format. When emitting the binding table, choose between the states
based on whether the image is declared write-only in the shader.

Tested on both Sascha Willems' computeshader sample (with the original
shaders and ones modified to declare images writeonly and omit their
format qualifiers) and on our own shaders for which we need support
for this.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-02-14 08:16:52 -08:00
Alex Smith
94d48b7f9f spirv: Add support for SpvCapabilityStorageImageWriteWithoutFormat
Allow that capability if the driver indicates that it is supported, and
flag whether images are read-only/write-only in the nir_variable (based
on the NonReadable and NonWritable decorations), which drivers may need
to implement this.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-14 08:16:52 -08:00
Iago Toral Quiroga
5c6eaa1421 nir/spirv: do not require a format with images that are not sampled
As soon as we support shaderStorageImageWriteWithoutFormat we can see
write-only images (sampled == 2) that don't have a format specified.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-02-14 08:16:52 -08:00
Jason Ekstrand
2c30918581 anv/apply_pipeline_layout: Set image.write_only to false
This makes our driver robust to changes in spirv_to_nir which would set
this flag on the variable.  Right now, our driver relies on spirv_to_nir
*not* setting var->data.image.write_only for correctness.  Any patch
which implements the shaderStorageImageWriteWithoutFormat will need to
effectively revert this commit.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-14 08:16:45 -08:00
Jason Ekstrand
f8dfe9b826 intel/isl: Add format metadata for typed reads/writes
This adds two columns to the format table as well as two helpers for
determining whether or not a given format is supported for typed reads
and writes.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-14 07:50:13 -08:00
Jason Ekstrand
0ef14cdc98 anv/cmd_buffer: Return a VkResult from verify_cmd_parser
This fixes a "statement with no effect" compiler warning

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-14 07:50:13 -08:00
Ilia Mirkin
956556b3c3 nvc0: disable linked tsc mode in compute launch descriptor
Empirically, this makes things work. Presumably this was originally
copied from the blob, which does make use of linked tsc mode.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99532
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-02-13 20:10:53 -05:00
Anuj Phogat
5e2909e732 mesa: Add EXT_frag_depth bits and enable it on all drivers
Passes the newly added piglit test for this extension on i965.

V2: Fix comments by Ilia.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-13 16:08:40 -08:00
Dave Airlie
b3b4114a0f radeonsi: use common sendmsg emission function.
This just ports radeonsi to use the sendmsg common code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-14 00:03:22 +00:00
Dave Airlie
e3324e0c60 radv/ac: use sendmsg emission interface.
This uses the common code to emit the correct intrinsic.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-14 00:03:18 +00:00
Dave Airlie
f32955be43 radeon/ac/llvm: add support for sendmsg emission
This lets us use the new intrinsic on the correct
version of llvm.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-14 00:02:50 +00:00
Dave Airlie
f77d2871ac radv: disable gfx init on CIK for now
Luzipher on IRC reports that this hangs his Hawaii; disable it for now
until I get time to debug.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-14 08:01:39 +10:00
Dave Airlie
69fc7a2c82 tgsi: fix memory leak in tgsi sanity check
This just fixes this without repeating the code.

Reported-by: Li Qiang
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-14 08:00:30 +10:00
Dave Airlie
62fef3e159 radv/ac: use common interp code for new intrinsics
This uses the common fs interp code to use the new
llvm intrinsics so llvm can drop the old ones.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-14 07:48:01 +10:00
Dave Airlie
592069c1fb radv: use indirect buffer for initial gfx state.
This puts the common gfx state for the device into an
indirect buffer, and just calls out to it, on CIK and above.

This is taken from what radeonsi does.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-13 20:02:45 +00:00
Dave Airlie
b26253b34d radv: start splitting init config up
This is just prep work for the following patch to use
a common gfx init indirect buffer.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-13 20:02:34 +00:00
Dave Airlie
604e562e5b radv: don't pass physical device to si_init_ fns.
This is just a trivial cleanup.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-13 20:02:06 +00:00
Dave Airlie
8b47b97215 radv: detect command buffers that do no work and drop them (v2)
If a buffer is just full of flushes we flush things on command
buffer submission, so don't bother submitting these.

This will reduce some CPU overhead on dota2, which submits a fair
few command streams that don't end up drawing anything.

v2: reorganise loop to count first then malloc,
rename some vars (Bas)
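
The "count first, then malloc" shape mentioned in the v2 note might look roughly like the sketch below. It is a simplified stand-in, not the radv code: `struct cmd_buffer`, its `has_work` flag and `collect_work` are hypothetical names, and the real driver decides emptiness from the contents of the command stream.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical command buffer: has_work is false when the buffer would
 * only contain flushes. */
struct cmd_buffer {
    bool has_work;
};

/* Returns a malloc'ed array of pointers to the buffers worth submitting
 * and stores its length in *out_count. Returns NULL with *out_count == 0
 * when nothing needs submitting or the allocation fails. */
static const struct cmd_buffer **
collect_work(const struct cmd_buffer *cmds, size_t count, size_t *out_count)
{
    assert(out_count != NULL);
    *out_count = 0;

    /* Pass 1: count, so the allocation is exactly as large as needed. */
    size_t n = 0;
    for (size_t i = 0; i < count; i++)
        if (cmds[i].has_work)
            n++;
    if (n == 0)
        return NULL;

    const struct cmd_buffer **list = malloc(n * sizeof(*list));
    if (list == NULL)
        return NULL;

    /* Pass 2: fill in only the buffers that do work. */
    size_t j = 0;
    for (size_t i = 0; i < count; i++)
        if (cmds[i].has_work)
            list[j++] = &cmds[i];

    *out_count = n;
    return list;
}
```

The caller owns the returned array and releases it with `free()`.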

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-13 20:00:28 +00:00
Jason Ekstrand
d49d275c41 anv/blorp: Don't sanitize the swizzle for blorp_clear
BLORP is now smart enough to handle any swizzle (even those that contain
ZERO or ONE) in a reasonable manner.  Just let BLORP handle it.  This
fixes the following Vulkan CTS tests on Haswell:

 - dEQP-VK.api.image_clearing.clear_color_image.1d_b4g4r4a4_unorm_pack16
 - dEQP-VK.api.image_clearing.clear_color_image.2d_b4g4r4a4_unorm_pack16
 - dEQP-VK.api.image_clearing.clear_color_image.3d_b4g4r4a4_unorm_pack16

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-02-13 09:24:49 -08:00
Jason Ekstrand
e233db6e93 intel/blorp: Swizzle clear colors on the CPU
It's trivial to swizzle clear colors on the CPU, easily deals with the
hardware restrictions for render target swizzles, and makes swizzled
clears work on all hardware as opposed to just HSW+.
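
A minimal sketch of what swizzling a clear color on the CPU can look like follows. The types and names (`enum chan_select`, `struct swizzle`, `swizzle_clear_color`) are hypothetical, and the direction of the mapping (each output channel reads the selected input channel) is an assumption; the actual BLORP code may apply the inverse mapping.

```c
#include <assert.h>

/* Hypothetical channel selectors using the usual ZERO/ONE/R/G/B/A
 * swizzle vocabulary. */
enum chan_select { CHAN_ZERO, CHAN_ONE, CHAN_R, CHAN_G, CHAN_B, CHAN_A };

struct swizzle {
    enum chan_select r, g, b, a;
};

static float select_channel(const float rgba[4], enum chan_select sel)
{
    switch (sel) {
    case CHAN_ZERO: return 0.0f;
    case CHAN_ONE:  return 1.0f;
    case CHAN_R:    return rgba[0];
    case CHAN_G:    return rgba[1];
    case CHAN_B:    return rgba[2];
    case CHAN_A:    return rgba[3];
    }
    assert(!"invalid channel select");
    return 0.0f;
}

/* Writes the swizzled color to out[]; in and out must not alias. */
static void swizzle_clear_color(const float in[4], struct swizzle swz,
                                float out[4])
{
    out[0] = select_channel(in, swz.r);
    out[1] = select_channel(in, swz.g);
    out[2] = select_channel(in, swz.b);
    out[3] = select_channel(in, swz.a);
}
```

Because ZERO and ONE are resolved here, the render target itself never needs a swizzle the hardware cannot express.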

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-02-13 09:24:43 -08:00
Emil Velikov
bd1c61261f docs: add news item and link release notes for 17.0.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-13 12:05:34 +00:00
Emil Velikov
437b6a136e docs: add sha256 checksums for 17.0.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 80b41d9899)
2017-02-13 12:02:58 +00:00
Emil Velikov
2343b8a262 docs: Update 17.0.0 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 683462e680)
2017-02-13 12:02:56 +00:00
Emil Velikov
20ccff56a0 st/xlib: remove always true ifdef GLX_EXTENSION guards
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2017-02-13 10:15:02 +00:00
Emil Velikov
884fd1262f xlib: remove always true ifdef GLX_EXTENSION guards
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2017-02-13 10:14:40 +00:00
Emil Velikov
261d5e4c6d glx: remove always true XDAMAGE_1_1_INTERFACE guard
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-02-13 10:14:32 +00:00
Emil Velikov
87f485e957 scons: check for libXdamage 1.1 or later
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-02-13 10:14:23 +00:00
Emil Velikov
43b09ee0b2 configure.ac: check for libXdamage 1.1 or later
libXdamage 1.1 was released back in 2007, so it should not be an issue for
anyone building from git.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-02-13 10:14:06 +00:00
Emil Velikov
bfac8d1749 glx: remove DRI2DriverPrimeShift compile guards
DRI2DriverPrimeShift was added in dri2proto-2.8, which we now require
as of the previous commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-02-13 10:13:46 +00:00
Emil Velikov
a1662d0dab vl: remove DRI2DriverPrimeShift compile guards
DRI2DriverPrimeShift was added in dri2proto-2.8, which we now require as
of the previous commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-02-13 10:13:29 +00:00
Emil Velikov
cd1ebd8aba scons: add missing dri2proto requirement
Noticed while skimming through, although admittedly there are many other
dependencies that are not tracked by the scons build.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-02-13 10:13:24 +00:00
Emil Velikov
6689cc0392 configure.ac: bump dri2proto requirement to 2.8
dri2proto 2.8 was released 4+ years ago, so it should come as no surprise
to anyone building mesa from git.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-02-13 10:12:56 +00:00
Emil Velikov
404a5ca088 glx: remove always true ifdef guards
The two symbols referenced were introduced with v2.2 and 2.3 of
the dri2proto package and we require dri2proto >= 2.6.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-02-13 10:12:36 +00:00
Emil Velikov
4f080b46a8 winsys/intel: remove unused winsys - ilo was its only user
Cc: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-13 10:09:52 +00:00
Emil Velikov
6ffddba33b configure.ac: do not use deprecated macros - AC_HELP_STRING AC_ERROR
Replace with AS_HELP_STRING and AC_MSG_ERROR respectively, as spotted by
autoupdate.

Note that the suggested AC_CANONICAL_SYSTEM > AC_CANONICAL_TARGET change
is not addressed here since that requires very extensive testing.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-13 10:09:45 +00:00
Timothy Arceri
0cbde643eb util/disk_cache: correctly use stat(3)
I forgot to error check stat() and also I wasn't using the subdir in
is_two_character_sub_directory().
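
For illustration, a directory check built on an error-checked stat() could look like the sketch below. It is not the Mesa implementation: the signature, the fixed-size path buffer and the exclusion of ".." are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical variant: returns true only if `name` is a two-character
 * entry inside `parent` and that entry is a directory. */
static bool is_two_character_subdir(const char *parent, const char *name)
{
    char path[4096];
    struct stat sb;

    assert(parent != NULL && name != NULL);

    if (strlen(name) != 2 || strcmp(name, "..") == 0)
        return false;

    /* Build the full path; stat() on the bare name would resolve it
     * against the current working directory instead of `parent`. */
    int n = snprintf(path, sizeof(path), "%s/%s", parent, name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return false;

    /* stat() can fail (for example if the entry vanished); treat any
     * failure as "not a directory" rather than reading sb. */
    if (stat(path, &sb) != 0)
        return false;

    return S_ISDIR(sb.st_mode);
}
```

Both points from the commit message appear here: the return value of stat() is checked before `sb` is used, and the path passed to it includes the parent directory.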

Fixes: d7b3707c61 "util/disk_cache: use stat() to check if entry is a directory"
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-13 10:01:12 +00:00
Michel Dänzer
0f53404565 configure.ac: Drop LLVM compiler flags more radically
Drop all -m*, -W*, -O*, -g* and -f* flags, with the exception of
-fno-rtti, which must be used if it's part of the llvm-config --cxxflags
output. We don't want LLVM to dictate the flags we use, and it can even
cause build failures, e.g. if LLVM and Mesa are built with different
compilers.

While we're at it, eat any whitespace preceding dropped flags as well.
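
The filtering rule can be restated as a small predicate. This C sketch only illustrates the rule described above; the commit itself changes configure.ac, and the function name `keep_llvm_flag` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical predicate: should a single flag taken from the
 * `llvm-config --cxxflags` output be kept? */
static bool keep_llvm_flag(const char *flag)
{
    assert(flag != NULL);

    /* The one exception: -fno-rtti must be kept when LLVM reports it. */
    if (strcmp(flag, "-fno-rtti") == 0)
        return true;

    if (flag[0] != '-' || flag[1] == '\0')
        return true;

    /* Drop every -m*, -W*, -O*, -g* and -f* flag. */
    switch (flag[1]) {
    case 'm':
    case 'W':
    case 'O':
    case 'g':
    case 'f':
        return false;
    default:
        return true;
    }
}
```

Flags such as `-I...` and `-D...` pass through unchanged under this rule.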

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-13 16:07:37 +09:00
Kenneth Graunke
57dc6d80a0 glsl: Drop resize-to-MaxPatchVertices hack.
TCS and TES inputs without an array size are implicitly sized to
gl_MaxPatchVertices.  But TCS outputs are apparently not:

   "If no size is specified, it will be taken from the output patch size
    (gl_VerticesOut) declared in the shader."

Fixes dEQP-GLES31.functional.program_interface_query.program_output.
array_size.separable_tess_ctrl.var.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-02-12 21:09:25 -08:00
Kenneth Graunke
1fad070f96 mesa: Ignore per-vertex array size in SSO pipeline validation.
We were already unwrapping types when the producer was a non-array
stage and the consumer was an arrayed-stage...but we ought to unwrap
both ends for TCS -> TES matching too.

This will allow us to drop the "resize to gl_MaxPatchVertices" check
shortly, which breaks some things.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-02-12 21:09:23 -08:00
Kenneth Graunke
e99df398f1 glsl: Update a comment about link errors for TCS && !TES.
OpenGL ES actually has spec text to prohibit this.  It's just OpenGL
that's confusing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-02-12 21:09:21 -08:00
Kenneth Graunke
365afbdaef mesa: Do a draw time check for TES && !TCS in ES 3.x.
ES 3.x requires both TCS and TES to be present.  We already checked
the TCS && !TES case above, so we just have to check !TCS && TES here.

Note that this is allowed in OpenGL, just not ES.

This fixes a subcase of:
dEQP-GLES31.functional.debug.negative_coverage.*.tessellation.single_tessellation_stage

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-02-12 21:09:19 -08:00
Kenneth Graunke
05a56893aa mesa: Do (TCS && !TES) draw time validation in ES as well.
Now that we have OES_tessellation_shader, the same situation can occur
in ES too, not just GL core profile.

Having a TCS but no TES may confuse drivers - i965 crashes, for example.

This prevents regressions in
ES31-CTS.core.tessellation_shader.single.xfb_captures_data_from_correct_stage
with some SSO pipeline validation changes I'm making.

v2: Add an ES spec citation (suggested by Alejandro)

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-02-12 21:09:14 -08:00
Jason Ekstrand
c59d1ea51b i965/sampler_state: Set the "Base Mip Level" field on Sandy Bridge
Fixes two GL ES 3.0 CTS tests on Sandy Bridge:

ES3-CTS.functional.texture.mipmap.cube.base_level.linear_linear
ES3-CTS.functional.texture.mipmap.cube.base_level.linear_nearest

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-02-12 17:56:32 -08:00
Jason Ekstrand
c4f8f395b2 i965/sampler_state: Pass texObj into update_sampler_state
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-02-12 17:56:32 -08:00
Jason Ekstrand
9df3778016 i965/sampler_state: Clamp min/max LOD to 14 on gen7+
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-02-12 17:56:32 -08:00
Ilia Mirkin
3970257cef st/mesa: don't pass compare mode for stencil-sampled textures
Fixes dEQP-GLES31.functional.stencil_texturing.misc.compare_mode_effect

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2017-02-12 19:26:25 -05:00
Ilia Mirkin
3f8b886e73 nv50,nvc0: use alternate samplers for stencil
The blob uses these, and it fixes a bunch of dEQP stencil sampling tests
involving border colors. Probably the Z-based samplers work somehow
differently wrt border colors when using the stencil swizzle.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-12 18:22:17 -05:00
Bas Nieuwenhuizen
1811ccf125 radv: Fix radv_GetPhysicalDeviceQueueFamilyProperties2KHR.
The structs have different sizes, so the arrays have different strides.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-13 00:18:19 +01:00
Wladimir J. van der Laan
55e00c7cfe etnaviv: Set shader instruction area correctly for GC3000
- Use the same instruction area on GC3000 as the Vivante driver.
  This allows the same number of instructions on GC3000 as GC2000
  instead of half.

- Makes sure that the "PE to FE" stall before updating the shader code
  or constants is hit (which is conditional on vs_offset > 0x4000). This
  is necessary on GC3000 too, as it increases stability.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-02-12 20:42:37 +01:00
Wladimir J. van der Laan
0fe60e4fcc etnaviv: Update hw header files
Update from etnaviv repository rnndb. This adds some newly
discovered state for GC3000 (and some GC2000) features.

Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-02-12 20:38:56 +01:00
Dave Airlie
f466d4dd6a radv: reduce CPU overhead merging bo lists.
Just noticed we do a fair bit of unneeded searching here.

Since we know that the buffers in a CS are unique already,
the first time we get any buffers, we can just memcpy those into
place, and when we are searching for subsequent CSes, we only
have to search up until where the previous unique buffers were.
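
The idea can be sketched as follows, under simplifying assumptions: buffers are plain integer handles, `struct cs_buffers` and `merge_bo_lists` are hypothetical names, and the output array is pre-sized by the caller. The real winsys code tracks more per-buffer state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one command stream's buffer list. Handles
 * within a single list are assumed to be unique already. */
struct cs_buffers {
    const uint32_t *handles;
    size_t count;
};

/* Merges all lists into out[] without duplicates and returns the number
 * of unique handles. out[] must have room for the sum of all counts. */
static size_t merge_bo_lists(const struct cs_buffers *cs, size_t cs_count,
                             uint32_t *out)
{
    size_t unique = 0;

    assert(out != NULL);

    for (size_t i = 0; i < cs_count; i++) {
        if (cs[i].count == 0)
            continue;

        if (unique == 0) {
            /* First non-empty list: nothing to compare against, so copy
             * it wholesale. */
            memcpy(out, cs[i].handles, cs[i].count * sizeof(*out));
            unique = cs[i].count;
            continue;
        }

        /* Entries appended from this list cannot collide with each
         * other, so only search what earlier lists contributed. */
        size_t prev_unique = unique;
        for (size_t j = 0; j < cs[i].count; j++) {
            int found = 0;
            for (size_t k = 0; k < prev_unique; k++) {
                if (out[k] == cs[i].handles[j]) {
                    found = 1;
                    break;
                }
            }
            if (!found)
                out[unique++] = cs[i].handles[j];
        }
    }
    return unique;
}
```

Capping the inner search at `prev_unique` is what avoids the unneeded comparisons: handles just appended from the current stream are never rescanned.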

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-12 19:00:19 +00:00
Ilia Mirkin
48f04862c1 nvc0: set the render condition in the compute object
Fixes GL45-CTS.compute_shader.conditional-dispatching

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-02-11 21:06:52 -05:00
Ilia Mirkin
7e75f0913a gm107/ir: fix address offset bitfield for ATOMS
Fixes GL45-CTS.compute_shader.atomic-case1 on Maxwell

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2017-02-11 21:06:41 -05:00
Ilia Mirkin
b38aab50a0 nv50/ir: convert an ATOM.EXCH without a destination into a store
On SM35 there does not appear to be a way to emit an ATOM.EXCH with a
null destination. This should be functionally equivalent to a plain
store, however, so just do that.

Fixes GL45-CTS.compute_shader.atomic-case2 on SM35.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-11 20:25:26 -05:00
Ilia Mirkin
2b0580123e nvc0: fix 64-bit integer query buffer writes
The former logic just plain didn't work at all. We need to write the
subsequent dword to the next buffer location.
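
On the CPU side the same layout rule looks like the sketch below: a 64-bit value occupies two consecutive dwords. The helper name is hypothetical and the low-dword-first order is an assumption (little-endian layout); the driver does this with GPU commands rather than direct stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a 64-bit result as two 32-bit dwords at
 * dword_offset and dword_offset + 1, low half first. */
static void write_u64_as_dwords(uint32_t *buf, size_t dword_offset,
                                uint64_t value)
{
    assert(buf != NULL);
    buf[dword_offset]     = (uint32_t)(value & 0xffffffffu);
    buf[dword_offset + 1] = (uint32_t)(value >> 32);
}
```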

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-11 20:25:26 -05:00
Ilia Mirkin
399e267f0e nv50/ir: return a register when retrieving thread id sysval
We have logic to short-circuit such retrievals to zero. However "zero"
was an immediate, and some logic expected to get registers (to later be
propagated). Fix this by using loadImm.

Fixes GL45-CTS.gpu_shader5.images_array_indexing

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-11 20:25:26 -05:00
Ilia Mirkin
0d1edb01ec nv50/ir: add missing break after DSSG
Recently broken during int64 addition.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-11 17:21:55 -05:00
Christian Gmeiner
137ad879d5 etnaviv: shader-db traces
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
2017-02-11 21:22:53 +01:00
Christian Gmeiner
7256ed3c79 etnaviv: keep track of emitted loops
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-02-11 21:22:48 +01:00
Christian Gmeiner
5a3ea68895 etnaviv: wire up core pipe_debug_callback
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
2017-02-11 21:22:42 +01:00
Jose Maria Casanova Crespo
5bc222ebaf glsl: non-last member unsized array on SSBO must fail compilation on GLSL ES 3.1
From GLSL ES 3.10 spec, section 4.1.9 "Arrays":

"If an array is declared as the last member of a shader storage block
 and the size is not specified at compile-time, it is sized at run-time.
 In all other cases, arrays are sized only at compile-time."

In desktop GLSL it is allowed to have unsized-arrays that are
not last, as long as we can determine that they are implicitly
sized, which is detected at link-time.

With this patch Mesa reports a compilation error as glslang does with
the following shader:

buffer SSBO { vec4 data[]; vec4 moreData;};
void main (void)
{
}

Fixes:
dEQP-GLES31.functional.debug.negative_coverage.log.shader.compile_compute_shader
dEQP-GLES31.functional.debug.negative_coverage.callbacks.shader.compile_compute_shader
dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.compile_compute_shader

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-10 23:14:12 -08:00
Eric Anholt
0514b0bdc9 vc4: Enable glSampleMask() even when !rasterizer->multisample.
gallium's blitter expects that it can set the sample mask even when the
rasterizer doesn't have the flag on.

Between this and the previous test, 10 new ext_framebuffer_multisample
tests start passing.
2017-02-10 14:17:05 -08:00
Eric Anholt
5c86f119b9 vc4: Respect glSampleMask() even when we're not writing color.
gallium's quad-based blitter for copying MSAA depth textures expects to be
able to do 4 passes updating a sample at a time using glSampleMask, and
there's no color buffer bound when it's doing that.
2017-02-10 14:17:04 -08:00
Eric Anholt
30237193f5 vc4: Use the nir_builder helper for loading sample mask.
2017-02-10 14:17:04 -08:00
Eric Anholt
ce538a443d vc4: Use accurate 1/w in coordinate shader as well as vert shader.
We probably shouldn't be emitting different scaled viewport coordinates
between vertex and coord.
2017-02-10 14:17:04 -08:00
Eric Anholt
a0b6841838 vc4: Drop VS inputs to 8.
In the hardware we only get to declare 8 vertex elements (GLES2's
minimum), so we should be exposing that number here.  Fixes an assertion
failure in piglit texrect-many, at the expense of various GL 2.0-ish
minmax tests now complaining that our count is too low.
2017-02-10 14:17:04 -08:00
Eric Anholt
b230939303 vc4: Avoid emitting small immediates for UBO indirect load address guards.
The kernel will reject our shader if we emit one here, and having 4, 8, or
12 as the top end of our UBO clamp is rare enough that it's not worth
making the kernel let us.

Fixes piglit fs-const-array-of-struct and
fs-const-array-of-struct-of-array since recent GLSL linking changes made
us get this as an indirect load of a uniform, instead of a temporary.

Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-02-10 14:17:04 -08:00
Timothy Arceri
d7b3707c61 util/disk_cache: use stat() to check if entry is a directory
d_type is not supported on all systems.

Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97967
2017-02-10 23:50:36 +11:00
Emil Velikov
463236bd31 st/nine: update configure options in the README
Cc: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-10 11:47:24 +00:00
Emil Velikov
b3b415609d configure.ac: supersede --enable-gallium-llvm over --enable-llvm
Currently we have extra (somewhat questionable) modularity, such that
one could build some parts with LLVM while others w/o.

That is extremely fragile, error prone and requires quite a noticeable
amount of code throughout.

Thus let's deprecate the gallium toggle in favour of the generic one.
The former will throw a warning when set, and it will be overwritten by
the latter. This will allow gradual transition w/o breaking people's
scripts.

v2: Rebase, document in release notes.

Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de> (v1)
2017-02-10 11:47:24 +00:00
Emil Velikov
bdd6147e29 configure.ac: remove dummy radeon_gallium_llvm_check()
The extra function brings no added benefit as of earlier commit which
made llvm_require_version (as called by radeon_llvm_check) require LLVM
(--enable-gallium-llvm).

Fixes: 5f966a96af7 "configure.ac: Mandate --enable-gallium-llvm when
checking LLVM version"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
2017-02-10 11:47:24 +00:00
Emil Velikov
d4840c0c26 configure.ac: correctly manage llvm auto-detection
Earlier refactoring commits changed from one, dare I say it, broken
behaviour to another. Namely:

Before, if you explicitly passed --enable-gallium-llvm, your selection was
ignored when llvm-config was not present/detected.
Today, the "auto" heuristics enable gallium llvm regardless of whether you
have llvm/llvm-config available or not.

Rework the auto-detection to account for llvm's presence.

v2: Set enable_gallium_llvm=no when LLVM is not found.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
Reported-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-02-10 11:47:24 +00:00
Emil Velikov
ce65cc1f1f configure.ac: disable enable_gallium_llvm in the !x86 case
Already implicitly handled throughout, but keep it clear and disable
gallium-llvm. This change should be a no-op.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
2017-02-10 11:47:24 +00:00
Emil Velikov
4d8bb9cf8c configure.ac: set LLVM_{C, CXX, LD}FLAGS only as needed
Earlier refactoring commits started setting the above regardless if LLVM
is used or not. Move them to the respective section to restore the
original functionality.

Since we require the preprocessor flags (includes in particular) for the
header version parsing keep those as-is. They are not used outside of
configure.ac thus should not cause any side-effects.

As-is, adding the C/CXXFLAGS can lead to build issues when
cross-compiling.

Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
2017-02-10 11:47:24 +00:00
Emil Velikov
fc30992a54 Revert "configure.ac: Create correct LLVM_VERSION_INT with minor >= 10"
As stated in [1] by the LLVM devs, the new versioning scheme will not
deploy any minor version (i.e. it will always be zero). As such the
patch should not be needed.

This reverts commit 0e9a5be7e7.

[1] http://blog.llvm.org/2016/12/llvms-new-versioning-scheme.html
Reviewed-by: Tobias Droste <tdroste@gmx.de>
2017-02-10 11:47:24 +00:00
Emil Velikov
5e9f4a5f3f configure.ac: don't use == with test
Although it works, it's not the correct thing to do.

v2: Rebase
v3: Rebase

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de> (v1)
2017-02-10 11:47:23 +00:00
Emil Velikov
65ee9dff69 configure.ac: remove unused LLVM variables
LLVM_BINDIR is completely unused while others such as LLVM_LIBDIR are
used only internally. In the latter case there's no need to AC_SUBST it.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
2017-02-10 11:47:23 +00:00
Tobias Droste
143c566a81 configure.ac: Only define HAVE_LLVM if LLVM is used
Make sure that HAVE_LLVM compiler define is only set if LLVM is
actually used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99010
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tobias Droste <tdroste@gmx.de>
v2 [Emil] fold within the existing conditional
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-10 11:47:23 +00:00
Tobias Droste
04377cbdcf configure.ac: Rework MESA_LLVM and LLVM detection
Set FOUND_LLVM only when LLVM is present (checking for exact version/etc
is deferred) and use enable-gallium-llvm to indicate the global LLVM
status.

Renaming the latter is not appropriate for stable patches, so we'll
address it with a later commit.

Loosely based on work by Tobias.

v2: Check FOUND_LLVM if enable_gallium_llvm is set.

Cc: Dave Airlie <airlied@redhat.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
2017-02-10 11:47:23 +00:00
Emil Velikov
5869a7db75 configure.ac: move enable-gallium-llvm dependency with-gallium-drivers
... to where it's applicable.

Since we effectively made --enable-gallium-llvm mean --enable-llvm with
earlier commits, we need to move the requirement to guard the components
added for the LLVM draw.

Otherwise we'll error (as below) when building RADV w/o gallium drivers.

configure: error: --enable-gallium-llvm is required when building radv

v2: Don't remove but move the dependency (Tobias).

Cc: Dave Airlie <airlied@redhat.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
2017-02-10 11:47:23 +00:00
Emil Velikov
a66ffcd736 configure.ac: Mandate --enable-gallium-llvm when checking LLVM version
With this change we effectively require --enable-gallium-llvm when
building RADV. This should be perfectly safe since the gallium radeonsi
driver already explicitly requires it.

The "gallium" part in --enable-gallium-llvm is about to be removed soon
(not in stable), but until then make sure that things can build.

To reflect the requirement (as opposed to check previously) we rename
llvm_check_version_for to llvm_require_version

Cc: Dave Airlie <airlied@redhat.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
2017-02-10 11:47:23 +00:00
Emil Velikov
514a494415 configure.ac: Rename the gallium_require_llvm helper
Drop the gallium prefix since we're about to use it throughout the
configure.

Note that we do want to keep the enable_gallium_llvm check since (as
explicitly requested) the toggle should mean --enable-llvm. The latter
is to be resolved with later patches.

Cc: Dave Airlie <airlied@redhat.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
2017-02-10 11:47:23 +00:00
Tobias Droste
f64d4d82bd configure.ac: Don't check LLVM version in require_llvm
This is actually not needed because the version is checked later.

Around line 2380
if test "x$enable_gallium_llvm" == "xyes"; then
    llvm_check_version_for $LLVM_REQUIRED_GALLIUM "gallium"
    llvm_add_default_components "gallium"
fi

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Cc: Tobias Droste <tdroste@gmx.de>
Signed-off-by: Tobias Droste <tdroste@gmx.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
v2: [Emil Velikov: rebase/respin series order]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-10 11:47:23 +00:00
Emil Velikov
38abcdba8a configure.ac: move AC_ARG_ENABLE([gallium-llvm] hunk further up
With the next commits we'll require --enable-gallium-llvm (en route to a
greater good later on) for RADV. Moving the hunk up is required for that,
as otherwise we'll fail to build.

Cc: Dave Airlie <airlied@redhat.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
2017-02-10 11:47:23 +00:00
Emil Velikov
3a7973fd15 configure.ac: remove unused AC_SUBST([MESA_LLVM])
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
2017-02-10 11:47:22 +00:00
Nicolai Hähnle
de6e6a347d loader: unconditionally include unistd.h and stdlib.h
Otherwise we would fail with "implicit declaration of function" geteuid
and getenv respectively.

To trigger, (re)move the libdrm.pc file and use the following:

 $ ./autogen.sh --disable-egl --disable-gbm --disable-dri \
    --with-dri-drivers=swrast --with-gallium-drivers=swrast
 $ make

Cc: Vinson Lee <vlee@freedesktop.org>
Fixes: 3f462050c ("loader: Add an environment variable to override driver name choice.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99701
v2: [Emil: handle stdlib.h add commit message]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-10 11:47:12 +00:00
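A minimal sketch of the kind of code that needs both headers named in de6e6a347d above: geteuid() is declared in unistd.h and getenv() in stdlib.h. The helper name and the privilege check are assumptions for illustration, not the loader's actual code.

```c
#include <stdlib.h>   /* getenv */
#include <unistd.h>   /* geteuid, getuid */

/* Hypothetical helper: honour an override variable only when the process
 * is not running with elevated (setuid) privileges. */
static const char *driver_override_from_env(const char *var_name)
{
    if (geteuid() != getuid())
        return NULL;

    return getenv(var_name);
}
```

Without the two includes, a C99 or later compiler reports an implicit declaration for each call, which is the failure the commit describes.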
Emil Velikov
a04cb3f8a5 intel/blorp: do not return const data by get_px_size_sa()
Not much point in the const qualifier since we provide a copy to the
user. Resolves the following -Wignored-qualifiers warning.

src/intel/blorp/blorp_blit.c:1857:8: warning: 'const' type qualifier on
return type has no effect [-Wignored-qualifiers]

v2: keep const qualifier of local variable.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-02-10 11:47:12 +00:00
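The -Wignored-qualifiers point in a04cb3f8a5 above can be sketched as follows. The struct and function names are hypothetical stand-ins, not the blorp types. A value returned by copy gains nothing from a const return type, while const on the local variable (kept in v2) still documents intent.

```c
#include <stdint.h>

/* Hypothetical stand-in for a small extent type returned by value. */
struct px_size {
    uint32_t w;
    uint32_t h;
};

/* No 'const' on the return type: the caller receives its own copy,
 * so the qualifier would have no effect. */
static struct px_size example_px_size(uint32_t samples)
{
    /* 'const' on the local is still meaningful inside the function. */
    const struct px_size size = {
        samples > 1 ? 2u : 1u,
        samples > 2 ? 2u : 1u,
    };

    return size;
}
```

The caller is free to modify the returned copy, for example `struct px_size s = example_px_size(4); s.w += 1;`.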
Marek Olšák
43a2ba1b7d gallium/radeon: use staging for texture read mappings from GTT WC
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-10 11:27:50 +01:00
Marek Olšák
dc7483f445 gallium/radeon: ignore the level parameter in buffer_transfer_map
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-10 11:27:50 +01:00
Marek Olšák
d86099df0a gallium/radeon: fix performance of buffer readbacks
We want cached GTT for all non-persistent read mappings.
Set level = 0 on purpose.

Use dma_copy, because resource_copy_region causes a failure in the PBO
read of piglit/getteximage-luminance.

If Rocket League used the READ flag, it should get cached GTT.

v2: mask out UNSYNCHRONIZED

Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-10 11:27:50 +01:00
Marek Olšák
24e3b06408 radeonsi: align vertex buffer descriptor list size for optimal prefetch
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-10 11:27:50 +01:00
Marek Olšák
3a534c5c7d radeonsi: align shader binaries to CP DMA alignment for optimal prefetch
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-10 11:27:50 +01:00
Marek Olšák
1a392a4377 radeonsi: move CP_DMA_ALIGNMENT definition
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-10 11:27:50 +01:00
Marek Olšák
4c288c73ea radeonsi: remove SI_CONTEXT_FLUSH_AND_INV_FRAMEBUFFER
not necessary

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-10 11:27:50 +01:00
Marek Olšák
65df38b191 radeonsi: remove separate CB/DB_META flush flags
not used separately

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-10 11:27:50 +01:00
Marek Olšák
8a2ae4153b radeonsi: reduce the number of FMASK input coordinates
Before:
  image_load v3, v[0:3] ...
After:
  image_load v3, v[0:1] ...

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-10 11:27:50 +01:00
Marek Olšák
28c06b3ceb radeonsi: write shader asm annotated with wave info into GPU hang reports
Note that the disassembly is written twice - first the unmodified compiler
output and then the wave-annotated output only if there are waves executing
the shader.

Sample output from a real GPU hang most likely caused by image_sample:

The number of active waves = 28

Pixel Shader - annotated disassembly:
    s_mov_b64 s[6:7], exec                                        ; BE86017E [PC=0x10f3e3800, off=0, size=4]
    s_wqm_b64 exec, exec                                          ; BEFE077E [PC=0x10f3e3804, off=4, size=4]
...
    image_sample v[7:9], v[0:1], s[12:19], s[20:23] dmask:0x7     ; F0800700 00A30700 [PC=0x10f3e3a94, off=660, size=8]
    s_buffer_load_dword s20, s[0:3], 0x50                         ; C0220500 00000050 [PC=0x10f3e3a9c, off=668, size=8]
    s_load_dwordx4 s[24:27], s[4:5], 0x170                        ; C00A0602 00000170 [PC=0x10f3e3aa4, off=676, size=8]
    s_load_dwordx8 s[12:19], s[4:5], 0x140                        ; C00E0302 00000140 [PC=0x10f3e3aac, off=684, size=8]
    s_buffer_load_dword s11, s[0:3], 0x5c                         ; C02202C0 0000005C [PC=0x10f3e3ab4, off=692, size=8]
    s_buffer_load_dword s21, s[0:3], 0x54                         ; C0220540 00000054 [PC=0x10f3e3abc, off=700, size=8]
    s_buffer_load_dword s22, s[0:3], 0x58                         ; C0220580 00000058 [PC=0x10f3e3ac4, off=708, size=8]
    s_waitcnt vmcnt(0)                                            ; BF8C0F70 [PC=0x10f3e3acc, off=716, size=4]
          ^ SE0 SH0 CU1 SIMD1 WAVE0  EXEC=aaaaaaa555aaaaaa  INST32=BF8C0F70
          ^ SE0 SH0 CU1 SIMD2 WAVE0  EXEC=aaaa85555555552a  INST32=BF8C0F70
          ^ SE0 SH0 CU1 SIMD3 WAVE0  EXEC=000000000000000a  INST32=BF8C0F70
          ^ SE0 SH0 CU6 SIMD1 WAVE0  EXEC=25a5a5aa82aaaaaa  INST32=BF8C0F70
          ^ SE0 SH0 CU6 SIMD3 WAVE0  EXEC=50aaaa8fffa55555  INST32=BF8C0F70
          ^ SE0 SH0 CU7 SIMD0 WAVE0  EXEC=5554aaaaaaa1a555  INST32=BF8C0F70
          ^ SE0 SH0 CU7 SIMD0 WAVE1  EXEC=aaaa5555ffffffff  INST32=BF8C0F70
          ^ SE0 SH0 CU7 SIMD1 WAVE0  EXEC=555557aaaaaaaaa5  INST32=BF8C0F70
          ^ SE0 SH0 CU7 SIMD3 WAVE0  EXEC=5555aaaaaaaaaa85  INST32=BF8C0F70
          ^ SE1 SH0 CU3 SIMD1 WAVE0  EXEC=aaaaaaaaaaaaaaaa  INST32=BF8C0F70
          ^ SE1 SH0 CU4 SIMD0 WAVE0  EXEC=aaaaaaaa5a5a5a5a  INST32=BF8C0F70
          ^ SE1 SH0 CU4 SIMD1 WAVE0  EXEC=aaaaaaa5a5a5a4a5  INST32=BF8C0F70
          ^ SE1 SH0 CU4 SIMD2 WAVE0  EXEC=5555555000000000  INST32=BF8C0F70
          ^ SE1 SH0 CU4 SIMD3 WAVE0  EXEC=aa555554155aaaaa  INST32=BF8C0F70
          ^ SE1 SH0 CU5 SIMD0 WAVE0  EXEC=55ffff55555555aa  INST32=BF8C0F70
          ^ SE1 SH0 CU5 SIMD1 WAVE0  EXEC=555555555aaaaaaa  INST32=BF8C0F70
          ^ SE1 SH0 CU5 SIMD2 WAVE0  EXEC=a0aaaaaaa8555555  INST32=BF8C0F70
          ^ SE1 SH0 CU5 SIMD3 WAVE0  EXEC=8aaaaaaaaaaaa555  INST32=BF8C0F70
          ^ SE1 SH0 CU6 SIMD0 WAVE0  EXEC=000000002aaaaaaa  INST32=BF8C0F70
          ^ SE2 SH0 CU1 SIMD0 WAVE0  EXEC=5aaaa5400aaaa15a  INST32=BF8C0F70
          ^ SE2 SH0 CU1 SIMD1 WAVE0  EXEC=00aaaaaaaa5555aa  INST32=BF8C0F70
          ^ SE2 SH0 CU1 SIMD2 WAVE0  EXEC=aa00005555554555  INST32=BF8C0F70
          ^ SE2 SH0 CU1 SIMD3 WAVE0  EXEC=aaaaaaa000000000  INST32=BF8C0F70
          ^ SE3 SH0 CU4 SIMD0 WAVE0  EXEC=5555aaaaaaaaaaaa  INST32=BF8C0F70
          ^ SE3 SH0 CU4 SIMD2 WAVE0  EXEC=ffaaaaaaaaaa5555  INST32=BF8C0F70
          ^ SE3 SH0 CU4 SIMD3 WAVE0  EXEC=aaaa55555555aa00  INST32=BF8C0F70
          ^ SE3 SH0 CU5 SIMD0 WAVE0  EXEC=00aaaaaaaaaaaa5a  INST32=BF8C0F70
          ^ SE3 SH0 CU5 SIMD1 WAVE0  EXEC=5a555555005555ff  INST32=BF8C0F70
    v_mul_f32_e32 v7, s6, v7                                      ; 0A0E0E06 [PC=0x10f3e3ad0, off=720, size=4]
...

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-10 11:27:50 +01:00
Marek Olšák
3de8c5a3c5 radeonsi: write wave information into GPU hang reports
UMR is our new debugging tool. It must have +s set for Mesa to use it
without root privileges:
  sudo chmod +s .../umr

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-10 11:27:50 +01:00
Marc-André Lureau
dc2d9b8da1 tgsi-dump: dump label if instruction has one
The instruction has an associated label when Instruction.Label == 1,
as can be seen in ureg_emit_label() or tgsi_build_full_instruction().

This fixes dump generating extra :0 labels on conditionals, and virgl
parsing more than the expected tokens and eventually reaching "Illegal
command buffer" (when parsing more than a safety margin of 10 we
currently have).

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-10 12:46:33 +10:00
Marc-André Lureau
bd1cab1168 tgsi: remove ureg_label_insn
Unused since commit 2897cb3dba.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-10 12:46:23 +10:00
Dave Airlie
e5a5d17d13 radv: handle queue submission with no cs but semaphores
It's legal to submit just semaphores with no command streams.
This patch fixes this case by emitting the empty cs; it also
handles the fence emission for this case better.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-09 23:45:33 +00:00
Timothy Arceri
a4086bb531 util/disk_cache: error check asprintf()
Fixes: f3d911463e "util/disk_cache: stop using ralloc_asprintf() unnecessarily"

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-10 09:25:32 +11:00
Timothy Arceri
41ad178b13 docs: add shader cache environment variables
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-10 09:22:52 +11:00
Ilia Mirkin
c95f821cb4 nvc0/ir: fix ubo max clamp, reset file index
We just increased the max UBO, so we should also increase the clamp that
we do for robustness. Similarly, as we're including the fileIndex in the
new indirect value, we should reset fileIndex to 0 so that it is not
added in a second time.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-02-09 15:50:58 -05:00
Ilia Mirkin
e4a698cb97 nv50/ir: always return 0 when trying to read thread id along unit dim
Many many many compute shaders only define a 1- or 2-dimensional block,
but then continue to use system values that take the full 3d into
account (like gl_LocalInvocationIndex, etc). So for the special case
that a dimension is exactly 1, we know that the thread id along that
axis will always be 0, so return it as such and allow constant folding
to fix things up.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-02-09 15:15:36 -05:00
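The idea in e4a698cb97 above can be sketched in plain C, assuming the usual flattened-index formula for gl_LocalInvocationIndex. The function name is hypothetical and this models the arithmetic only, not the nv50/ir code.

```c
#include <stdint.h>

/* Flattened index within a work group, in the style of
 * gl_LocalInvocationIndex: z * size_x * size_y + y * size_x + x. */
static uint32_t local_invocation_index(const uint32_t id[3],
                                       const uint32_t size[3])
{
    uint32_t tid[3];

    for (int i = 0; i < 3; i++) {
        /* A dimension of exactly 1 can only ever hold thread id 0,
         * so the id along that axis is replaced by a constant. */
        tid[i] = (size[i] == 1) ? 0 : id[i];
    }

    return tid[2] * size[0] * size[1] + tid[1] * size[0] + tid[0];
}
```

For a 1-dimensional block such as 8x1x1, the y and z terms become multiplications by a constant zero, so a constant-folding pass can reduce the whole expression to the x thread id.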
Ilia Mirkin
1acdd62847 nvc0/ir: fix robustness guarantees for constbuf loads on kepler+ compute
Kepler and up unfortunately only support up to 8 constbufs. We work
around this by loading from constbufs as if they were storage buffers.
However we were not consistently applying limits to loads from these
buffers. Make sure to do the same thing we do for storage buffers.

Fixes GL45-CTS.robust_buffer_access_behavior.uniform_buffer

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-02-09 15:15:22 -05:00
Ilia Mirkin
59ca352fc5 nvc0: increase number of ubo binding points
Apparently GL 4.5 requires 14 of these (there's a "*" in the spec, but
it's unclear what it refers to). We need to expose an extra binding
point for the "program parameters", which means this must be 15. Remove
the last vestige of the "use c14 for immediates" idea.

Fixes GL45-CTS.shading_language_420pack.binding_uniform_block_array

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-02-09 15:15:08 -05:00
Ilia Mirkin
8a2d88e934 configure: add blurb about what the LIBDRM_*_REQUIRED stuff means
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-02-09 12:57:49 -05:00
Ilia Mirkin
1e4f5988ed nvc0: expose int64
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-09 12:57:49 -05:00
Ilia Mirkin
ab00a41a6e nvc0/ir: make it possible to have the flags def in def0
There's all kinds of logic that doesn't like there being holes in defs
or srcs lists. Avoid them. This also fixes the sched logic for maxwell.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-09 12:57:48 -05:00
Ilia Mirkin
61d7676df7 nvc0/ir: add support for 64-bit shift lowering on SM20/SM30
Unfortunately there is no SHF.L/SHF.R instruction pre-SM35. So we have
to do a bit more work to get the job done.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-09 12:57:48 -05:00
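As a rough sketch of what lowering a 64-bit shift (61d7676df7 above) involves when only 32-bit shifts are available, the following C function shifts a value held as two 32-bit halves. It is one common formulation, assumed for illustration, and not necessarily the instruction sequence the compiler emits.

```c
#include <stdint.h>

/* Shift a 64-bit value, given as 32-bit halves, left by 'shift' (0..63)
 * using only 32-bit operations. */
static void shl64_via_32(uint32_t lo, uint32_t hi, unsigned shift,
                         uint32_t *out_lo, uint32_t *out_hi)
{
    if (shift == 0) {
        /* Handled separately: 'lo >> 32' below would be undefined. */
        *out_lo = lo;
        *out_hi = hi;
    } else if (shift < 32) {
        *out_lo = lo << shift;
        /* Bits shifted out of the low half move into the high half. */
        *out_hi = (hi << shift) | (lo >> (32 - shift));
    } else {
        /* The low half is shifted entirely into the high half. */
        *out_lo = 0;
        *out_hi = lo << (shift - 32);
    }
}
```

A right shift follows the same pattern with the roles of the halves swapped.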
Ilia Mirkin
1aefd6159c nvc0/ir: add support for all the new int64 tgsi opcodes
A few thoughts:
 - Some of that LegalizeSSA logic should really live much earlier and be
   subject to the likes of DCE and other useful passes
 - Some of the "lowering" done in from_tgsi should be done later so that
   proper optimization might be done.

However this all works and the above can be improved upon later.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-09 12:57:48 -05:00
Pierre Moreau
009c54aa7a nv50/ir: Split 64-bit integer MAD/MUL operations
Hardware does not support 64-bit integer MAD and MUL operations, so we need
to transform them into 32-bit operations.

Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
2017-02-09 12:57:48 -05:00
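A minimal sketch of the split described in 009c54aa7a above, assuming the target offers 32-bit multiplies that return either the low or the high half of the product. The helper names are hypothetical, and the 64-bit cast inside mul_hi32 only models such a "multiply high" instruction.

```c
#include <stdint.h>

/* Low 32 bits of a 32x32 multiply. */
static uint32_t mul_lo32(uint32_t a, uint32_t b)
{
    return a * b;
}

/* High 32 bits of a 32x32 multiply (models a "mul high" instruction). */
static uint32_t mul_hi32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* Low 64 bits of a 64x64 multiply, built from 32-bit pieces. */
static void mul64_via_32(uint32_t a_lo, uint32_t a_hi,
                         uint32_t b_lo, uint32_t b_hi,
                         uint32_t *out_lo, uint32_t *out_hi)
{
    *out_lo = mul_lo32(a_lo, b_lo);
    /* The a_hi * b_hi term only affects bits 64 and up, so it is dropped. */
    *out_hi = mul_hi32(a_lo, b_lo) + mul_lo32(a_lo, b_hi) + mul_lo32(a_hi, b_lo);
}
```

A MAD would additionally add the 64-bit addend to the result, propagating the carry from the low half into the high half.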
Ilia Mirkin
22c705ea8c nvc0/ir: add a "high" subop for shifts, emit shf.l/shf.r for 64-bit
Note that this is not available for SM20/SM30.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-09 12:57:48 -05:00
Ilia Mirkin
2e986fa806 nvc0/ir: fix SET and SLCT emission
We were never emitting a .X flag for consuming condition code on SET,
and weren't emitting a signed type for SLCT comparison. Discovered while
working on int64 logic.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-09 12:57:48 -05:00
Ilia Mirkin
eac5099c11 nvc0/ir: add support for emitting partial min/max ops for int64
These operations allow you to compute min/max on arbitrary-width
integers, 32 bits at a time.

Note that the low/med ops implicitly set the condition code, and the
med/high ops implicitly consume it.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-09 12:57:48 -05:00
Ilia Mirkin
b090033087 gallium: add separate PIPE_CAP_INT64_DIVMOD
Nouveau does not currently have logic to implement this as a library
function. Even though such a library could be written, there's no big
advantage to do it that way for now given that int64 is a very uncommon
use-case. Allow a driver to expose INT64 without supporting division and
modulo operations.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-09 12:57:21 -05:00
Eric Engestrom
6a71a69a12 docs: improve the list of gl implementations
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-09 15:45:08 +00:00
Eric Engestrom
8278f1ec35 docs: improve the list of implemented APIs
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-09 15:44:51 +00:00
Matt Turner
d7a0486a9e glsl: Allow compatibility shaders with MESA_GL_VERSION_OVERRIDE=...
Previously if you used MESA_GL_VERSION_OVERRIDE=3.3COMPAT, Mesa exposed
an OpenGL 3.3 compatibility profile context (with various unimplemented
features and bugs), but still refused to compile shaders with

   #version 330 compatibility

This patch simply adds a small bit of plumbing to let that through.

Of course the same caveats apply: compatibility profile is still not
supported (and will not be supported), so there are no guarantees that
anything will work.

Tested-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-02-09 15:14:43 +00:00
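To make the override string in d7a0486a9e above concrete, here is a small sketch that parses a value such as "3.3COMPAT" into a version and a compatibility flag. The struct and function are hypothetical and handle only the COMPAT suffix mentioned in the commit message; this is not Mesa's actual parser.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result type for illustration. */
struct version_override {
    int major;
    int minor;
    bool compat;
};

/* Parse "<major>.<minor>" with an optional "COMPAT" suffix. */
static bool parse_version_override(const char *s, struct version_override *out)
{
    int consumed = 0;

    /* %n records how many characters the numeric part used. */
    if (sscanf(s, "%d.%d%n", &out->major, &out->minor, &consumed) != 2)
        return false;

    out->compat = strcmp(s + consumed, "COMPAT") == 0;
    return true;
}
```

A shader compiler front end could then accept "#version 330 compatibility" only when the compat flag is set, which is the kind of plumbing the commit describes.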
Eric Engestrom
89b4176eb1 docs: reword sentence that my brain can't parse
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2017-02-09 13:04:16 +00:00
Eric Engestrom
30cf9ffb59 docs: https all the links \o/
Most of them already redirected to https anyway, so we might as well
avoid the redirection and the security implications by linking directly
to the right protocol.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-09 11:28:15 +00:00
Eric Engestrom
2b0fe3cff7 docs: fix gallium wiki link in relnotes
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-09 11:28:10 +00:00
Eric Engestrom
9f8a6a5b79 docs: update 'thanks' for hosting
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-09 11:26:22 +00:00
Samuel Iglesias Gonsálvez
ca16f0a282 i965/fs: add support for int64 to bool conversion
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-09 10:18:34 +01:00
Samuel Iglesias Gonsálvez
824e1bb078 nir: add opcode to perform int64 to bool conversions
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-09 10:18:34 +01:00
Samuel Iglesias Gonsálvez
7ab26613db i965/fs: Add support for nir_op_[iu]2[iu]32
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-09 10:18:34 +01:00
Samuel Iglesias Gonsálvez
7b5834ff54 i965/fs: Add support for nir_op_[iu]642f
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-09 10:18:34 +01:00
Samuel Iglesias Gonsálvez
b115407d75 i965/fs: legalize [u]int64 to 32-bit data conversions in lower_d2x
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-09 10:18:34 +01:00
Jason Ekstrand
8734461c58 i965/fs: Add support for nir_op_[iu]642d
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-02-09 10:18:34 +01:00
Jason Ekstrand
91d2d26f33 i965: Allow int64 conversion operations in channel_expressions
This fixes 143 of the new piglit tests added by Nicolai

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-02-09 10:18:34 +01:00
Timothy Arceri
f3d911463e util/disk_cache: stop using ralloc_asprintf() unnecessarily
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-09 14:11:24 +11:00
Timothy Arceri
0bf21519b7 glsl: add param to force shader recompile
This will be used to skip checking the cache and force a recompile.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-09 12:22:56 +11:00
Timothy Arceri
4026b45bbc util: add a disk_cache_remove() function
This will be used to remove cache items created with old versions
of Mesa or other invalid cache items from the cache.

V2: rename stub function (cache_* functions were renamed disk_cache_*
in master).

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-09 12:22:56 +11:00
Timothy Arceri
a3fd8bb8c5 st/mesa/i965: create link status enum
For the on-disk shader cache we want to be able to differentiate
between a program that was linked and one that was loaded from cache.

V2:
 - don't return the new enum directly to the application when queried,
   instead return GL_TRUE or GL_FALSE as required. Fixes google-chrome
   corruptions when using cache.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-09 12:22:56 +11:00
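The V2 note in a3fd8bb8c5 above can be sketched like this: internally there are three states, but an application query collapses them to a boolean. All names here are hypothetical and chosen for illustration.

```c
/* Hypothetical three-state link status. */
enum example_link_status {
    EXAMPLE_LINKING_FAILURE = 0,
    EXAMPLE_LINKING_SUCCESS,
    EXAMPLE_LINKING_SKIPPED   /* program was loaded from the cache */
};

/* Stand-ins for GL_FALSE / GL_TRUE. */
#define EXAMPLE_GL_FALSE 0
#define EXAMPLE_GL_TRUE  1

/* What an application-facing query should return: never the raw enum. */
static int query_link_status(enum example_link_status status)
{
    return status != EXAMPLE_LINKING_FAILURE ? EXAMPLE_GL_TRUE
                                             : EXAMPLE_GL_FALSE;
}
```

Returning the raw enum would hand the application the value 2 for a cached program, which code comparing against GL_TRUE would treat as a failure.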
Brian Paul
ac5845453c docs: update intro.html to mention new APIs, etc
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-02-09 00:02:20 +00:00
Brian Paul
b2722a8970 docs: the site is now hosted by freedesktop.org
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2017-02-09 00:02:13 +00:00
Bas Nieuwenhuizen
f22836dbdd radv: Add CPU color packing for VK_FORMAT_A2B10G10R10_UNORM_PACK32.
For allowing fast color clears in the main render targets of dota2.

[airlied: fix clear_vals[1] as suggested by Andres.]

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-08 22:43:11 +00:00
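As an illustration of CPU-side packing for the format named in f22836dbdd above, the sketch below converts four floats to a 32-bit word, assuming the Vulkan bit layout for VK_FORMAT_A2B10G10R10_UNORM_PACK32 (A in bits 30..31, B in 20..29, G in 10..19, R in 0..9). The function names are hypothetical and this is not the radv implementation.

```c
#include <stdint.h>

/* Convert a float to an unsigned normalized integer of 'bits' width,
 * clamping to [0, 1] and rounding to nearest. */
static uint32_t float_to_unorm(float v, unsigned bits)
{
    const float max = (float)((1u << bits) - 1u);

    if (!(v > 0.0f))        /* also maps NaN to 0 */
        return 0;
    if (v > 1.0f)
        v = 1.0f;
    return (uint32_t)(v * max + 0.5f);
}

/* Pack RGBA floats into a single A2B10G10R10 word. */
static uint32_t pack_a2b10g10r10_unorm(float r, float g, float b, float a)
{
    return  float_to_unorm(r, 10)
         | (float_to_unorm(g, 10) << 10)
         | (float_to_unorm(b, 10) << 20)
         | (float_to_unorm(a, 2)  << 30);
}
```

Opaque red, for example, packs to 0xC00003FF under these assumptions.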
Roland Scheidegger
f64d74aa19 mesa: (trivial) include <inttypes.h> for PRIx64 macros
Fixes a compile error with mingw.
2017-02-08 21:56:16 +01:00
Tim Rowley
c1aa444a3e swr: [rasterizer jitter] Pass LLVM-IR size into jitter
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:58:13 -06:00
Tim Rowley
e0a829d320 swr: [rasterizer core] Frontend SIMD16 WIP
Removed temporary scaffolding in PA, widened the PA_STATE interface
for SIMD16, and implemented PA_STATE_CUT and PA_TESS for SIMD16.

PA_STATE_CUT and PA_TESS now work in SIMD16.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:58:06 -06:00
Tim Rowley
79174e52b5 swr: [rasterizer jitter] Disable unsafe FP optimizations in the jitter
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:58:00 -06:00
Tim Rowley
db599e316a swr: [rasterizer core] Frontend SIMD16 WIP
Widen simdvertex to SIMD16/simd16vertex in frontend for passing VS
attributes from VS to PA.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:57:52 -06:00
Tim Rowley
09c54cfd2d swr: [rasterizer jitter] Add DEBUGTRAP jit builder function
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:57:47 -06:00
Tim Rowley
b01f26e005 swr: [rasterizer jitter] Multisample blend jit fix
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:57:41 -06:00
Tim Rowley
8780706c62 swr: [rasterizer jitter] Change SimdVector representation to array
Make all SimdVectors in LLVM represented as simdscalar[4] rather
than a struct.

Fixes issues with promotion of values from i32 to i64 to match
register width.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:57:33 -06:00
Tim Rowley
d159b0bf34 swr: [rasterizer jitter] Fix issues with stream-out on llvm>=3.8
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:57:27 -06:00
Tim Rowley
8423ad437b swr: [rasterizer jitter] Adjust jitter header includes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:57:20 -06:00
Tim Rowley
feecd7dcf5 swr: [rasterizer core] Frontend SIMD16 WIP
SIMD16 Primitive Assembly (PA) only supports TriList and RectList.

CUT_AWARE_PA, TESS, GS, and SO disabled in the SIMD16 front end.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-02-08 13:57:10 -06:00
Eric Engestrom
a618d6c3e9 docs: update package contents
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-08 12:00:28 -07:00
Eric Engestrom
06e40dc671 docs: fix unpacking instructions
File names were wrong, file formats were wrong, bunzip command was
wrong...

I also removed all but the simplest example; people who use pipes already
know how to untar, so let's simplify and remove potential confusion for
non-tech-savvy users.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-08 12:00:24 -07:00
Eric Engestrom
d7e1a16f1a docs: remove dead 'beta' link
Release candidates haven't been in a 'beta' subdir in a long time, so let's
replace the dead link with an explanation instead.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-08 12:00:19 -07:00
Eric Engestrom
5b10c362de docs: add a note about the new version scheme
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-02-08 12:00:14 -07:00
Bartosz Tomczyk
94262e5f5d r600/sb: Fix memory leak
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-02-08 17:36:05 +01:00
Timothy Arceri
90014d0766 mesa: use PRId64/PRIu64 when printing 64-bit ints
V2: actually use PRIu64

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-08 13:50:01 +11:00
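A short sketch of the portable formatting that 90014d0766 above refers to, using the PRId64 and PRIu64 macros from inttypes.h. The wrapper function names are hypothetical.

```c
#include <inttypes.h>
#include <stdio.h>

/* Format an unsigned 64-bit value without guessing between %lu and %llu. */
static int format_u64(char *buf, size_t size, uint64_t value)
{
    return snprintf(buf, size, "%" PRIu64, value);
}

/* Same for a signed 64-bit value. */
static int format_i64(char *buf, size_t size, int64_t value)
{
    return snprintf(buf, size, "%" PRId64, value);
}
```

The macros expand to the correct length modifier for the platform, which avoids format warnings and truncated output where long is 32 bits wide.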
Dave Airlie
c674f11e42 mesa/st: fix strict aliasing issue in int64 code.
This fixes the int64 code in the same way as the double code.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-08 02:13:07 +00:00
Dave Airlie
30cff4f5f7 mesa/uniform: fix strict aliasing issues with int64 code.
This fixes these like the double version does.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-08 02:12:31 +00:00
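The two strict aliasing fixes above (c674f11e42 and 30cff4f5f7) do not spell out the mechanism. A common remedy, assumed here for illustration, is to copy through memcpy instead of dereferencing a cast pointer. The helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Read a 64-bit integer out of storage that is otherwise accessed as
 * 32-bit words. Casting 'words' to 'const int64_t *' and dereferencing
 * it would violate strict aliasing; memcpy is well defined. */
static int64_t load_i64(const uint32_t *words)
{
    int64_t value;

    memcpy(&value, words, sizeof(value));
    return value;
}

/* Write a 64-bit integer into 32-bit word storage the same way. */
static void store_i64(uint32_t *words, int64_t value)
{
    memcpy(words, &value, sizeof(value));
}
```

Compilers typically turn these fixed-size copies into plain loads and stores, so the safe form should cost nothing at run time.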
Dave Airlie
6d5d6dad20 radv: handle dcc in explicit image resolve path. (v2)
We need to initialize dcc like we do in the subpass path.

v2: fix initial/final layouts
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-07 23:31:08 +00:00
Bas Nieuwenhuizen
0d1283850b radv: Enable fast clears by default.
Works for me on dota2 and talos now.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
2017-02-07 22:58:06 +01:00
Jason Ekstrand
1de3cd8a34 spirv: Add more asserts in vtn_vector_construct
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99465
2017-02-07 08:08:06 -08:00
Emil Velikov
25aa98c014 configure.ac: remove src/gallium/winsys/intel/drm/Makefile reference
Not wired up (not referenced in any SUBDIR), leading to `make distcheck'
failure.

Fixes: d77fa310ed "ilo: EOL drop unmaintained gallium drv from buildsys"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-07 14:18:13 +00:00
Emil Velikov
73bce69938 docs: reword ilo removal note
Properly annotate <li> and keep the note analogous to all the previous
ones - OpenVG, st/egl, etc.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-07 14:18:12 +00:00
Boyan Ding
97495c428d configure.ac: Remove redundant libglvnd stanza
There were two "libglvnd configuration" sections in the squashed commit
that added libglvnd support, while only one in the original libglvnd
branch. A following commit moves one of them downwards. Now remove the
upper "older" one and move GL_LIB name decision downwards after the new
libglvnd configuration section.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
2017-02-07 14:17:53 +00:00
Emil Velikov
bef4d74047 travis: use both cores for make/make check
The instance offers 2 cores, so use them to speed things up.

v2: Set MAKEFLAGS instead [Eric]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-02-07 11:14:10 +00:00
Emil Velikov
30267172c7 travis: add nearly all gallium drivers to the list
Note: we need the explicit --enable-freedreno for libdrm since the
latter is 'smart' and disables it if building on !arm platforms.

The radeonsi and swr are explicitly left out since they require
'too-recent' LLVM - 3.6

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-02-07 11:14:10 +00:00
Emil Velikov
96d86b18ee travis: correct libdrm required regex to also track libdrm itself
The current regex was tracking only the libdrm_foo packages, while with
recent changes we bumped only (and rightfully so) libdrm.

Fix the regex to track any libdrm package.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-02-07 11:14:10 +00:00
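The difference between tracking only libdrm_foo packages and tracking any libdrm package can be sketched as below. Both regexes and the sample lines are illustrative assumptions in the style of a configure script; they are not the patterns or values from the actual Travis setup.

```python
import re

# Hypothetical sample lines; names and versions are for illustration only.
lines = [
    "LIBDRM_REQUIRED=2.4.75",
    "LIBDRM_RADEON_REQUIRED=2.4.56",
    "LIBDRM_INTEL_REQUIRED=2.4.75",
    "DRI2PROTO_REQUIRED=2.8",
]

# Assumed old behaviour: only driver-specific libdrm_foo variables match.
driver_only = re.compile(r"^LIBDRM_\w+_REQUIRED=")
# Assumed fixed behaviour: the driver part is optional, so plain libdrm matches too.
any_libdrm = re.compile(r"^LIBDRM(_\w+)?_REQUIRED=")


def tracked(pattern, text_lines):
    """Return the variable names on the lines the pattern selects."""
    return [line.split("=")[0] for line in text_lines if pattern.match(line)]
```

With the first pattern a bump of the generic libdrm requirement goes unnoticed; with the second it is picked up alongside the driver-specific ones.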
Emil Velikov
49f6408940 configure.ac: add swr to the gallium drivers list.
v2: Rebase on top of ILO removal.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-02-07 11:14:10 +00:00
Emil Velikov
9d5b681a11 configure.ac: list all the dri-drivers in the help string
It's unlikely that any of the additions come as a surprise to anyone:
i915, nouveau, radeon, r200. Regardless, state clearly what's
available.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-02-07 11:14:09 +00:00
Marc Di Luzio
21efe2528c glsl: correct compute shader checks for memoryBarrier functions
As per the spec -
"The functions memoryBarrierShared() and groupMemoryBarrier() are
available only in compute shaders; the other functions are available
in all shader types."

Conform to this by adding another delegate to check for compute
shader support instead of only whether the current stage is compute

This allows some fragment shaders in Dirt Rally to compile

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-06 21:12:33 -08:00
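The availability rule quoted from the spec in 21efe2528c can be sketched as a predicate. This is a minimal model, not the GLSL compiler's code: the stage strings, the `has_compute_support` flag and the set of "other" barrier functions are assumptions made for illustration.

```python
# Per the quoted spec text, these two are compute-only.
COMPUTE_ONLY = {"memoryBarrierShared", "groupMemoryBarrier"}
# Assumed set of the "other functions" that arrive with compute shader support.
OTHER_BARRIERS = {
    "memoryBarrierAtomicCounter",
    "memoryBarrierBuffer",
    "memoryBarrierImage",
}


def barrier_available(name, stage, has_compute_support):
    """Return True if the builtin `name` may be used in shader `stage`."""
    if not has_compute_support:
        return False
    if name in COMPUTE_ONLY:
        # Only these depend on the current stage being compute.
        return stage == "compute"
    # The rest depend on compute shader support, not on the stage.
    return name in OTHER_BARRIERS
```

Under this model a fragment shader may call memoryBarrierBuffer() as long as compute shaders are supported, which is the case the commit enables.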
Li Qiang
83fb63d31d gallium/tgsi: fix oob access in parse instruction
When parsing a texture instruction, the loop doesn't stop while
'cur' is ','; the loop variable 'i' keeps increasing and is used
to index the 'inst.TexOffsets' array. This can lead to an
out-of-bounds access. This patch avoids it.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Li Qiang <liq3ea@gmail.com>
2017-02-07 14:00:04 +10:00
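The kind of guard described in 83fb63d31d can be sketched as follows. This is not the TGSI text parser: the token list, the function name and the capacity of 4 are assumptions for illustration. The point is only that the count is checked before each store into a fixed-size array.

```python
MAX_TEX_OFFSETS = 4  # assumed capacity of the fixed-size offsets array


def parse_tex_offsets(tokens):
    """Collect comma-separated offsets, rejecting input that would overflow."""
    offsets = []
    for tok in tokens:
        if tok == ",":
            continue  # separator: another offset follows
        if len(offsets) >= MAX_TEX_OFFSETS:
            # Without this check the index would run past the array.
            raise ValueError("too many texture offsets")
        offsets.append(tok)
    return offsets
```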
Kenneth Graunke
ce8a63de6d Revert "i965: Disable guardband clipping in the smaller-than-viewport case."
This reverts commit 0bac2551e4.

Now that we position the guardband correctly (applying translations
in addition to scaling) and made it as large (or larger) than the
render target, this shouldn't be necessary.

Now we leave guardband clipping enabled 100% of the time, like the
Windows driver does.

Fixes GL45-CTS.gtf21.GL2FixedTests.clip.clip.  It tries to draw a
16384x64 rectangle, and it appears that some kind of numerical
imprecision in the clipper results in some edge pixels going missing.
The Windows driver passes this test because of guardband clipping.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-02-06 17:40:14 -08:00
Kenneth Graunke
ece0e535a4 i965: Always scissor on Gen6-7.5 instead of disabling guardband.
Previously we disabled the guardband when the viewport was smaller than
the framebuffer on Gen6-7.5, to prevent portions of primitives from
being drawn outside of the viewport.  On Gen8+, we relied on the viewport
extents test to effectively scissor this away for us.

We can simply always enable scissoring instead.  We already include the
viewport in the scissor rectangle, so this will effectively do the
viewport extents test for us.  (The only difference is that the scissor
rectangle doesn't support sub-pixel values.  I think that's okay.)

Given that the viewport extents test is essentially a second scissor,
and is enabled for basically all 3D drawing on Gen8+, it stands to
reason that scissoring is cheap.  Enabling the guardband reduces the
cost of clipping, which is expensive.

The Windows driver appears to never disable guardband clipping, and
appears to use scissoring in this case.  I don't know if they leave
it on universally though.

This fixes misrendering in Blender, where the "floor plane" grid lines
started rendering at wrong angles after I disabled XY clipping of line
primitives.  Enabling the guardband seems to solve the issue.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99339
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-02-06 17:40:14 -08:00
Jason Ekstrand
f3c068c5c8 i965: Use a better guardband calculation.
(Patch co-authored by Jason and Ken.)

We scaled the guardband based on the viewport size, but failed to
take into account the translation portion of the viewport transform.

This meant the guardband was always centered around the origin.
We want it to be centered around the screen-space drawing area,
which is the intersection of the viewport and the render target.

At best, getting this wrong would reduce the guardband's effectiveness
in some cases.  At worst, it might break things - objects outside of the
guardband are trivially rejected, so getting the guardband in the wrong
place and leaving guardband clipping enabled could cause problems.

v2: drop clamping of positive maximums.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-06 17:40:14 -08:00
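The idea in f3c068c5c8 of centering the guardband on the drawing area, rather than on the origin, can be sketched for one axis as below. This is a simplified model and not the i965 code: it assumes the viewport transform is screen = scale * ndc + translate, and the half-size of 16384 is an assumed hardware limit.

```python
def guardband_ndc(fb_size, scale, translate, gb_half_size=16384.0):
    """Return (min, max) guardband extents in NDC for one axis."""
    if scale == 0:
        return (-1.0, 1.0)  # degenerate viewport: fall back to no guardband
    # Viewport extent in screen space (NDC -1..1 mapped through the transform).
    vp_min = translate - abs(scale)
    vp_max = translate + abs(scale)
    # Drawing area: intersection of the viewport and the render target.
    area_min = max(0.0, vp_min)
    area_max = min(float(fb_size), vp_max)
    # Center the guardband on the drawing area instead of the origin.
    center = (area_min + area_max) / 2
    gb_min = center - gb_half_size
    gb_max = center + gb_half_size
    # Undo the viewport transform, translation included, to get back to NDC.
    lo = (gb_min - translate) / scale
    hi = (gb_max - translate) / scale
    return (min(lo, hi), max(lo, hi))
```

For a 1024-wide target with a viewport covering all of it (scale 512, translate 512) this gives a symmetric range of -32 to 32. Shifting the viewport half off the target (translate 1024) moves the range to -32.5 to 31.5, which only comes out right because the translation is taken into account.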
Kenneth Graunke
89ad7f1be6 i965: Combine the Gen6 SF and Clip viewport atoms.
The next patch will make the guardband calculation dependent on the
transformation matrix.  Instead of computing it in both atoms, just
combine them into a single atom.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-02-06 17:40:14 -08:00
Dave Airlie
90ac2285f0 radv: pass FMASK alignment to application
As was done for dcc and cmask.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-07 10:42:01 +10:00
Bas Nieuwenhuizen
47ca0f537d radv: Pass DCC alignment to application.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
2017-02-07 01:19:22 +01:00
Bas Nieuwenhuizen
eb01b20cc4 radv: Pass CMASK alignment to application.
CMASK alignment can be greater than image data alignment, so pass
it to the app so that it knows what alignment the backing memory
should have.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-07 01:18:53 +01:00
Dave Airlie
a864ef7f48 radv/ac: avoid the fmask path when doing txs.
This fixes the vulkan samples deferredmultisampling test.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-06 22:57:52 +00:00
Bruce Cherniak
11d6f836d0 swr: [rasterizer core] Removed unused clip code.
Removed unused Clip() and FRUSTUM_CLIP_MASK define.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-02-06 16:30:50 -06:00
Bruce Cherniak
bf29495dcd swr: [rasterizer core] Remove dead code Clipper::ClipScalar()
Clipper::ClipScalar() is dead code and should be removed.  It is causing
an error with gcc-7 because it references a now defunct member.

v2: includes bugzilla reference, same code change

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99633
CC: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2017-02-06 16:27:53 -06:00
Eric Anholt
72e6d1f00a gallium: Remove vc4 simulator hack from loader infrastructure.
Now that there's MESA_LOADER_DRIVER_OVERRIDE for choosing the driver name
we load, we don't need this any more.

v2: Get the junk out of pipe_loader_drm.c, too.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v2)
2017-02-06 12:44:06 -08:00
Eric Anholt
3f462050c2 loader: Add an environment variable to override driver name choice.
My vc4 simulator has been implemented so far by having an entrypoint
claiming to be i965, which was a bit gross.  The simulator would be a lot
less special if we entered through the vc4 entrypoint like normal, so add
a loader environment variable to allow the i965 fd to probe as vc4.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-06 12:44:06 -08:00
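The override described in 3f462050c2 amounts to consulting an environment variable before trusting the probed driver name. A minimal sketch follows; the function name and the plain-dict environment are assumptions, and the real loader may apply further checks that are not modelled here.

```python
import os


def choose_driver(probed_name, env=None):
    """Return the driver name to load, honouring the override if it is set."""
    if env is None:
        env = os.environ
    # An empty or missing variable leaves the probed name untouched.
    return env.get("MESA_LOADER_DRIVER_OVERRIDE") or probed_name
```

For example, `choose_driver("i965", {"MESA_LOADER_DRIVER_OVERRIDE": "vc4"})` selects vc4 for an fd that probes as i965, which is the simulator use case from the commit message.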
Eric Anholt
61bb1a9795 targets: Use a macro to reduce cut and paste in driver setup.
All the replicated prototypes/function bodies obfuscated the interesting
logic of the file: the mapping from driver enable macros to entrypoints we
expose, and the way that the swrast entrypoints are special compared to
the DRM entrypoints.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-06 12:44:06 -08:00
Dave Airlie
13a28ff236 radeon/ac: move common llvm build functions to a separate file.
Suggested by Marek.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-07 05:46:35 +10:00
Nicolai Hähnle
8822f4dfb9 eglmesaext: add new enums for EGL_MESA_drm_image_formats
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-02-06 17:41:28 +01:00
Nicolai Hähnle
6b0d390184 docs: add EGL_MESA_drm_image_formats extension proposal
2017-02-06 17:41:10 +01:00
Nicolai Hähnle
7be0e602ed dri/common: clear the loaderPrivate pointer in driDestroyDrawable
The GLX specification says about glXDestroyPixmap:

    "The storage for the GLX pixmap will be freed when it is not current
     to any client."

We're not really following this language to the letter: some of the storage
is freed immediately (in particular, the dri3_drawable, which contains both
GLXDRIdrawable and loader_dri3_drawable). So we NULL out the pointers to
that freed storage; the previous patches added the corresponding NULL-pointer
checks.

This fixes memory corruption in piglit
./bin/glx-visuals-depth/stencil -pixmap -auto

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-06 17:39:44 +01:00
Nicolai Hähnle
f446f3fb33 glx: guard swap-interval functions against destroyed drawables
The GLX specification says about glXDestroyPixmap:

    "The storage for the GLX pixmap will be freed when it is not current
     to any client."

So arguably, functions like glXSwapIntervalMESA can be called after
glXDestroyPixmap has been called for the currently bound GLXPixmap.
In that case, the GLXDRIDrawable no longer exists, and so we just skip
those calls.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-06 17:39:30 +01:00
Nicolai Hähnle
21ec35566b glx/dri3: guard in_current_context against a disappeared drawable
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-06 17:39:10 +01:00
Nicolai Hähnle
40c304fc06 glx/dri3: handle NULL pointers in loader-to-DRI3 drawable conversion
With a subsequent patch, we might see NULL loaderPrivates, e.g. when
a DRIdrawable is flushed whose corresponding GLXDRIdrawable was destroyed.
This resulted in a crash, since the loader vs. DRI3 drawable structures
have a non-zero offset.

Fixes glx-visuals-{depth,stencil} -pixmap

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-06 17:39:01 +01:00
Juan A. Suarez Romero
02264bc6f9 anv/pipeline: set ThreadDispatchEnable conditionally
Set 3DSTATE_WM/ThreadDispatchEnable bit on/off based on the same
conditions as used in the GL version.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-06 10:27:44 +01:00
Alejandro Piñeiro
dfb1b543f3 main/fbobject: default_framebuffer allowed for GetFramebufferParameter
Before 4.5, the default framebuffer was not allowed for
GetFramebufferParameter, so it should return INVALID_OPERATION for any
call using the default framebuffer.

4.5 included new pnames, and some of them are allowed for the default
framebuffer. For the rest, INVALID_OPERATION. From OpenGL 4.5 spec,
section 9.2.3 "Framebuffer Object Queries":

   "An INVALID_OPERATION error is generated by GetFramebufferParameteriv
    if the default framebuffer is bound to target and pname is not one
    of the accepted values from table 23.73, other than
    SAMPLE_POSITION."

Fixes:
GL45-CTS.direct_state_access.framebuffers_get_parameter_errors

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-06 08:50:21 +01:00
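The error rule from dfb1b543f3 can be sketched as a small validation function. This is a model of the quoted spec text, not Mesa's implementation: the string return values and the `has_gl45_pnames` flag are assumptions, and the pname set is the list quoted in the spec excerpt of commit 0fb0c57b15.

```python
# Pnames the 4.5 spec excerpt allows for the default framebuffer.
DEFAULT_FB_PNAMES = {
    "DOUBLEBUFFER",
    "IMPLEMENTATION_COLOR_READ_FORMAT",
    "IMPLEMENTATION_COLOR_READ_TYPE",
    "SAMPLES",
    "SAMPLE_BUFFERS",
    "STEREO",
}


def check_default_fb_query(is_default_fb, pname, has_gl45_pnames):
    """Return the GL error name for a GetFramebufferParameteriv call."""
    if not is_default_fb:
        return "NO_ERROR"  # user framebuffers are validated elsewhere
    if not has_gl45_pnames:
        # Before 4.5 the default framebuffer was never allowed.
        return "INVALID_OPERATION"
    if pname in DEFAULT_FB_PNAMES:
        return "NO_ERROR"
    return "INVALID_OPERATION"
```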
Alejandro Piñeiro
0fb0c57b15 main/fbobject: implement new 4.5 pnames for GetFramebufferParameter
4.5 added new pnames allowed for GetFramebufferParameter, and
GetNamedFramebufferParameter.

From OpenGL 4.5 spec, section 9.2.3 "Framebuffer Object Queries" (quoting
the paragraph with only the new pnames, not all the supported):

   "pname may also be one of DOUBLEBUFFER,
    IMPLEMENTATION_COLOR_READ_FORMAT, IMPLEMENTATION_COLOR_READ_TYPE,
    SAMPLES, SAMPLE_BUFFERS, or STEREO, indicating the corresponding
    framebuffer-dependent state from table 23.73. Values of
    framebuffer-dependent state are identical to those that would be
    obtained were the framebuffer object bound and queried using the
    simple state queries in that table. These values may be queried
    from either a framebuffer object or a default framebuffer."

Fixes:
GL45-CTS.direct_state_access.framebuffers_get_parameters

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-06 08:50:21 +01:00
Alejandro Piñeiro
0cd2a4737e main/framebuffer: refactor _mesa_get_color_read_format/type
The current implementation returns the value for the currently bound read
framebuffer. GetNamedFramebufferParameteriv needs to query it for any
given framebuffer, and GetFramebufferParameteriv can use the same
method.

It was refactored to accept a given framebuffer. If NULL is
passed, it uses the currently bound framebuffer.

It also adds a call to _mesa_update_state. When used only by
GetIntegerv, this one was called as part of the extra checks defined
at get_hash. But now that the method is used by more methods, and the
update is needed, it makes sense (and it is safer) to just call it in
the method itself, instead of relying on the caller.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-06 08:50:21 +01:00
Dave Airlie
106a51440d radv: fix shared memory load/stores.
If we have an indirect index here we need to scale it by attribute slots,
e.g. if this is vec2[256] then we get an indir_index in the 0..255 range
but the vec2s are aligned inside vec4 slots. So scale the indir index,
then extract the channels.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-03 19:53:03 +00:00
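The index scaling described in 106a51440d comes down to simple arithmetic. The sketch below is illustrative only; the function and parameter names are assumptions and it does not reflect how radv emits the LLVM IR.

```python
CHANNELS_PER_SLOT = 4  # every attribute slot is vec4-sized


def shared_channel_index(indir_index, component, slots_per_element=1):
    """Flat channel index of `component` within array element `indir_index`."""
    # Scale the element index by whole slots first, then pick the channel.
    return indir_index * slots_per_element * CHANNELS_PER_SLOT + component
```

For vec2[256], element 3 component 1 lands at channel 3 * 4 + 1 = 13, whereas treating the array as tightly packed would give 3 * 2 + 1 = 7 and read the wrong data.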
Dave Airlie
a1a8aef4c9 radv/ac: correctly size shared memory usage.
We count the number of slots used, but slots are vec4 sized,
so we have to scale by 16 not 4.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-03 19:52:13 +00:00
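The sizing fix in a1a8aef4c9 is a unit conversion from slots to bytes. A worked sketch, assuming 4-byte channels and with hypothetical names:

```python
BYTES_PER_CHANNEL = 4  # assumed 32-bit channels
CHANNELS_PER_SLOT = 4  # a slot is vec4-sized


def shared_memory_bytes(num_slots):
    """Bytes of shared memory needed for `num_slots` vec4 slots."""
    return num_slots * CHANNELS_PER_SLOT * BYTES_PER_CHANNEL  # slots * 16
```

So 256 slots need 256 * 16 = 4096 bytes; scaling by 4 would reserve only 1024 bytes, a quarter of what is used.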
Dave Airlie
66463b7f75 radv: fix compute shared memory stores since 64-bit.
These regressed and caused doom to stop loading.

Fixes:
03724af26 radv/ac: Implement Float64 load/store var.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-03 19:51:52 +00:00
Brian Paul
023a9e3d92 docs: replace URL in features.txt
Replace unmaintained http://dri.freedesktop.org/wiki/MissingFunctionality
URL with http://mesamatrix.net/

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95460
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-03 12:02:38 -07:00
Brian Paul
2fac98f865 mesa: whitespace fixes in context.c
Remove trailing whitespace, replace tabs with spaces.  Trivial.
2017-02-03 11:48:25 -07:00
Nanley Chery
84dbf68378 anv/blorp: Disable resolves for transparent black clears
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-02-03 09:23:13 -08:00
Nanley Chery
93b819154f anv/cmd_buffer: Don't temporarily enable CCS_E within a render pass
Compressing a render target and decompressing it in the same
single-subpass render pass may waste bandwidth. While this may be
beneficial in some circumstances, it does not help in all. Reclaims
about 1.95% FPS for Dota 2 on some configurations.

v2 (Jason Ekstrand):
- Provide a more thorough comment
- Enable CCS_D for input attachments
v3 (Jason Ekstrand):
- Provide performance numbers

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-02-03 09:23:13 -08:00
Kenneth Graunke
3f064e9a40 mesa: Don't crash when destroying contexts created with no visual.
dEQP-EGL.functional.create_context.no_config tries to create a context
with no config, then immediately destroys it.  The drawbuffer is never
set up, so we can't dereference it asking if it's double buffered, or
we'll crash on a null pointer dereference.

Just bail early.

Applications using EGL_KHR_no_config_context could hit this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-02-03 08:55:02 -08:00
Samuel Pitoiset
af303abcdb winsys/amdgpu: avoid potential segfault in amdgpu_bo_map()
cs can be NULL when it comes from r600_buffer_map_sync_with_rings()
to avoid doing the same checks. It was checked for write mappings
but not for read mappings.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-03 12:07:14 +01:00
Tapani Pälli
0a2dcd3a8a android: fix droid_create_image_from_prime_fd_yuv for YV12
Earlier changes introduced is_ycrcb flag which checks the component
order of u and v components. Condition for setting the flag was
incorrect, with ycrcb we are supposed to have cr before cb.

This patch (together with a fix in our gralloc) fixes corrupted
rendering from 'test-opengl-gl2_yuvtex' native test and corrupted
gallery thumbnail in application switcher on Android-IA.

Fixes: 51727b1cf5
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
2017-02-03 07:44:33 +02:00
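The corrected condition from 0a2dcd3a8a can be sketched with plane offsets. The function name and the use of byte offsets are assumptions; the sketch only models the ordering test, not the Android platform code.

```python
def is_ycrcb(cb_offset, cr_offset):
    """True when the Cr plane comes before the Cb plane, as in YV12."""
    return cr_offset < cb_offset


# Example: a 4x4 YV12 image stores 16 bytes of Y, then 4 of Cr, then 4 of Cb.
yv12_cr_offset = 16
yv12_cb_offset = 20
```

With those offsets `is_ycrcb(yv12_cb_offset, yv12_cr_offset)` is true, while a layout that stores Cb first yields false.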
Edward O'Callaghan
3879425917 ilo: EOL unmaintained older gallium intel driver
This is no longer actively maintained and is just
accumulating bitrot.

Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Acked-by: Chia-I Wu <olvaffe@gmail.com>
2017-02-03 16:13:46 +11:00
Edward O'Callaghan
d77fa310ed ilo: EOL drop unmaintained gallium drv from buildsys
This is no longer actively maintained and is just
accumulating bitrot.

Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Acked-by: Chia-I Wu <olvaffe@gmail.com>
2017-02-03 16:13:36 +11:00
Edward O'Callaghan
01b625ef1a ilo: EOL unplumb unmaintained gallium drv from winsys
This is no longer actively maintained and is just
accumulating bitrot.

Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Acked-by: Chia-I Wu <olvaffe@gmail.com>
2017-02-03 16:13:32 +11:00
Ilia Mirkin
2b4eaabff0 configure: libdrm is a single package
The intent of the libdrm_$driver version limits has always been to not
burden the "other" drivers with updating their libdrm unless really
necessary. Unfortunately the configure script erroneously only checked
the driver-specific bit and not the generic bit of libdrm as well. Fix
this.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-02 20:36:09 -05:00
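The check described in 2b4eaabff0 can be sketched as a version comparison against both limits. This is not the configure script's logic: the function names are hypothetical and the version strings in the usage note are examples.

```python
def parse_version(text):
    """Turn '2.4.75' into (2, 4, 75) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def libdrm_ok(installed, generic_required, driver_required):
    """The single libdrm package must satisfy both the generic and driver limits."""
    have = parse_version(installed)
    return (have >= parse_version(generic_required)
            and have >= parse_version(driver_required))
```

For instance, `libdrm_ok("2.4.70", "2.4.75", "2.4.56")` is false even though the driver-specific limit alone would pass, which is the case the erroneous check let through.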
Ilia Mirkin
7d3f9ed71c st/mesa: MAX_VARYING is the max supported number of patch varyings, not min
This fixes
GL45-CTS.tessellation_shader.tessellation_shader_tessellation.max_in_out_attributes
on nouveau. We only support 30 patch varyings (as 2 vec4 slots end up
being used for tess level settings), but were getting 32 exposed.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-02-02 20:28:58 -05:00
Ilia Mirkin
e73f87fcbd vbo: process buffer binding state changes on draw when recording
The VBO module keeps track of any vbo buffers. It updates this list when
receiving an InvalidateState call, however this never happens when
recording draws right now. Make sure that we do all the usual state
updates when recording draws so that the VBO list may be kept up to
date.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99631
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-02-02 20:28:27 -05:00
Dave Airlie
6cc3c46f58 radv/ac: move to using shared emit_ddxy code.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-03 09:54:04 +10:00
Dave Airlie
c9a2fc3679 radeonsi/ac: move most of emit_ddxy to shared code.
We can reuse this in radv.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-03 09:54:04 +10:00
Dave Airlie
278d5ef70a radv/ac: use shared thread id code
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-03 09:54:04 +10:00
Dave Airlie
c5f0a56aeb radeonsi/ac: move get thread id to shared code.
radv will use this.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-03 09:54:04 +10:00
Dave Airlie
1c5c268a8a radv/ac: migrate to using shared code for some load/store stuff.
This migrates to the code shared with radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-03 09:54:04 +10:00
Dave Airlie
b3c28942c7 radeonsi/ac: move tbuffer store and buffer load to shared code.
These are all reusable by radv.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-03 09:54:04 +10:00
Dave Airlie
a9773311f6 radeonsi/ac: move a bunch of load/store related things to common code.
These are all shareable with radv, so start migrating them to the
common code.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-03 09:54:04 +10:00
Eduardo Lima Mitev
e198a64e35 texgetimage: Add check for the effective target to GetTextureSubImage
OpenGL 4.5 spec, section "8.11.4 Texture Image Queries", page 233 of
the PDF states:

    "An INVALID_OPERATION error is generated if texture is the name of a buffer
     or multisample texture."

This is currently not being checked and e.g. a multisample texture image can
be passed down to the driver hook. On i965, it is crashing the driver with an
assertion:

intel_mipmap_tree.c:3125: intel_miptree_map: Assertion `mt->num_samples <= 1' failed.

v2: (Ilia Mirkin) Move the check from gettextimage_error_check() to
    GetTextureSubImage() and use the texObj target.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-03 00:43:46 +01:00
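The target check added in e198a64e35 can be sketched as below. The set of target names and the string error codes are assumptions chosen to mirror the quoted spec sentence; this is not the Mesa function.

```python
# Targets the quoted spec text rejects: buffer and multisample textures.
REJECTED_TARGETS = {
    "TEXTURE_BUFFER",
    "TEXTURE_2D_MULTISAMPLE",
    "TEXTURE_2D_MULTISAMPLE_ARRAY",
}


def check_get_texture_sub_image(target):
    """Return the GL error name for a GetTextureSubImage call on `target`."""
    if target in REJECTED_TARGETS:
        # Fail early so the image never reaches the driver hook.
        return "INVALID_OPERATION"
    return "NO_ERROR"
```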
Marek Olšák
dfe111368d Revert "radeonsi: decrease the number of texture slots to 24"
This reverts commit bdd860e307.

Requested by a game developer.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-03 00:39:48 +01:00
Dave Airlie
b457f67495 configure.ac: explicitly require libdrm for dri classic drivers.
Although this might come from somewhere else require it explicitly.

Reviewed-by: Chad Versace <chadversary@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-03 09:38:15 +10:00
Jason Ekstrand
37a6f48ceb intel/isl: Add a better comment for format_supports_ccs_e
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-02 13:33:43 -08:00
Jason Ekstrand
45b3eb4dfc anv: Remove the finishme for CCS_E with storage images
The data port can't handle CCS at all so replace the finishme with
better comments.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-02 13:33:43 -08:00
Jason Ekstrand
fc9f0db8e3 intel/isl: Assert that we don't use CCS for storage images
I enabled CCS for storage images in the Vulkan driver and ran it through
the CTS.  It didn't result in any hangs but it demonstrated that the data
port cannot handle CCS.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-02 13:33:43 -08:00
Jason Ekstrand
7e6a9d9c4b intel/isl: Add a formats_are_ccs_e_compatible helper
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-02 13:33:43 -08:00
Jason Ekstrand
6142e3c07c intel/isl: Add a format_supports_ccs_d helper
Nothing uses this yet but it serves as a nice bit of documentation
that's relatively easy to find.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-02 13:33:43 -08:00
Jason Ekstrand
ab06fc6684 intel/isl: Rename supports_lossless_compression to supports_ccs_e
The term "lossless compression" could potentially mean multisample
color compression, single-sample color compression or HiZ because they
are all lossless.  The term CCS_E, however, has a very precise meaning
in ISL and is only used to refer to single-sample color compression.
It's also much shorter which is nice.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-02-02 13:33:43 -08:00
Nanley Chery
043d92fef9 anv/pass: Store the depth-stencil attachment's last subpass index
Commit 968ffd6c86 stored the last subpass
index of all the attachments but that of the depth-stencil attachment.
This could cause depth buffers used in multiple subpasses not to be in
the requested final layout. Fix this error.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2017-02-02 10:36:14 -08:00
Nicolai Hähnle
a020cb3a72 gallium: turn PIPE_SHADER_CAP_DOUBLES into a screen capability
Make the cap consistent with PIPE_CAP_INT64.

Aside from the hypothetical case of using draw for vertex shaders (and
actually caring about doubles...), every implementation supports doubles
either nowhere or everywhere.

Also, st/mesa didn't even check the cap correctly in all supported
shader stages.

While at it, add a missing LLVM version check for 64-bit integers in
radeonsi. This is conservative: judging by the log, LLVM 3.8 might be
sufficient, but there are probably bugs that have been fixed since then.

v2: fix clover (Marek)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-02-02 16:53:42 +01:00
Plamena Manolova
96123dbad9 mesa: Enable EXT_compressed_ETC1_RGB8_sub_texture
Since we already have the functionality in place and games
like Game of Thrones seem to depend on this extension, I
think it makes sense to enable it by making it part of
the extension string even though it's still a draft:

https://www.khronos.org/registry/gles/extensions/EXT/EXT_compressed_ETC1_RGB8_sub_texture.txt

Note: OES_compressed_ETC1_RGB8_sub_texture seems to be listed
in gl2ext.h, but there's no documentation for it in the KHR
registry

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-02 12:28:31 +00:00
Vinson Lee
6ee4665a77 configure: Only require libdrm 2.4.75 for intel.
Fixes: b8acb6b179 ("configure: Require libdrm >= 2.4.75")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-02-02 13:10:00 +10:00
Lionel Landwerlin
7158255069 anv: enable VK_KHR_shader_draw_parameters
Enables 10 tests from:

   dEQP-VK.draw.shader_draw_parameters.*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-02-02 01:33:16 +00:00
Lionel Landwerlin
9413e11869 anv: emit DrawID if needed
v2: use define for buffer ID (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-02-02 01:33:06 +00:00
Lionel Landwerlin
543d5db4e2 anv: always allocate a vertex element with vertexid or instanceid
Up to now on Gen8+ we only allocated a vertex element for
gl_InstanceIndex or gl_VertexIndex when a vertex shader uses
gl_BaseInstanceARB or gl_BaseVertexARB. This is because we would
configure the VF_SGVS packet to make the VF unit write the
gl_InstanceIndex & gl_VertexIndex values right behind the values
computed from the vertex buffers.

In the next commit we will also write the gl_DrawIDARB value. Our
backend expects to pull the gl_DrawIDARB value from the element
following the element containing gl_InstanceIndex, gl_VertexIndex,
gl_BaseInstanceARB and gl_BaseVertexARB (see
vec4_vs_visitor::setup_attributes). Therefore we need to allocate an
element for the SGVS elements as long as at least one of the SGVS
elements is read by the shader. Otherwise our shader will use a
gl_DrawIDARB value pulled from the URB one element too far (most
likely garbage).

v2: Fix my english (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-02-02 01:32:58 +00:00
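The layout expectation explained in 543d5db4e2 can be sketched as a list of element labels. This is a simplified model with hypothetical names; it does not reproduce the anv pipeline code or the backend's attribute setup.

```python
def reads_any_sgvs(vertex_id=False, instance_id=False,
                   base_vertex=False, base_instance=False):
    """True if the shader reads at least one of the four SGVS values."""
    return vertex_id or instance_id or base_vertex or base_instance


def vertex_elements(num_inputs, reads_sgvs, reads_draw_id):
    """Return the element layout as labels, in the order the backend expects."""
    layout = ["input%d" % i for i in range(num_inputs)]
    if reads_sgvs:
        layout.append("sgvs")     # vertex/instance index and base values
    if reads_draw_id:
        layout.append("draw_id")  # expected right after the SGVS element
    return layout
```

If the SGVS element were allocated only when the base values are read, a shader reading just the vertex index would have its draw ID pulled from one element too far, which is the mismatch the commit removes.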
Lionel Landwerlin
289aef771d anv: move BaseVertexID/BaseInstanceID vertex buffer index to 31
v2: use define for buffer ID (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-02-02 01:32:48 +00:00
Lionel Landwerlin
98cf60a3ce anv: limit vertex buffers to 31
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-02-02 01:32:39 +00:00
Mauro Rossi
9c45bb731c android: fix llvm, elf dependencies for M, N releases
These changes set the correct llvm version and elf include path
which differ for Marshmallow and Nougat

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-02-01 23:01:35 +00:00
Jason Ekstrand
ccdd5b3738 anv: Don't use bogus alpha swizzles
For RGB formats in Vulkan, we use the corresponding RGBA format with a
swizzle of RGB1.  While this swizzle is exactly what we want for
texturing, it's not allowed for rendering according to the docs.  While
we haven't been getting hangs or anything, we should probably obey the
docs.  This commit just sanitizes all render swizzles so that the alpha
channel maps to ALPHA.
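
As a rough illustration only (the enum and struct below are hypothetical
stand-ins, not the isl definitions), sanitizing a render swizzle amounts to
keeping the color channels and forcing alpha back to ALPHA:

```c
/* Hypothetical channel selectors; not the real isl types. */
enum chan { CHAN_RED, CHAN_GREEN, CHAN_BLUE, CHAN_ALPHA, CHAN_ZERO, CHAN_ONE };

struct swizzle { enum chan r, g, b, a; };

/* Keep the color channels but make alpha map to ALPHA for rendering. */
static struct swizzle sanitize_render_swizzle(struct swizzle s)
{
    s.a = CHAN_ALPHA;
    return s;
}
```

With this sketch, an RGB1 texturing swizzle would become RGBA when used for
rendering, while the sampling path could keep RGB1.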

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-01 14:41:06 -08:00
Micah Fedke
752ae38a09 Add missing copyright header to wayland-egl-priv.h
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-01 22:33:40 +00:00
Dave Airlie
cda9f3d8ec radv: handle VK_QUEUE_FAMILY_IGNORED in image transitions (v3)
The CTS tests at least are using this, and we were totally
ignoring it.

This hopefully fixes the bouncing multisample CTS tests.

v2: get family mask in ignored case from command buffer.
v3: only change things in one place, use logic from Bas.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-02 08:25:04 +10:00
Dave Airlie
fa316ed02f radv/ac: handle clip/cull distance sizing in geometry shader outputs
Otherwise we were writing these as 4 components, and things went bad.

Fixes (the remaining):
dEQP-VK.clipping.user_defined.*.vert_geom.*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-02 08:25:04 +10:00
Dave Airlie
230e308ff9 radv/ac: add const_index to fetch index for gs inputs
This fixes clip distance fetches as they are single item loads
with a const_index like float[1].

Fixes:
dEQP-VK.clipping.user_defined.*.vert_geom.[0-6]

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-02 08:25:04 +10:00
Dave Airlie
dc68b920df radeonsi/ac: move frag interp emission code to shared llvm code.
This code should be used in radv, so move it to a shared location
in advance of doing that.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-02 08:24:53 +10:00
Timothy Arceri
b940b2fd16 st/mesa: inline get_mesa_program()
In the past I've gotten this function confused with the one in
ir_to_mesa.cpp of the same name. Now that the affected flag setting
has moved into a helper it makes sense just to inline this remaining
code.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-02 08:31:28 +11:00
Timothy Arceri
a7050ea1f9 st/mesa: create set_prog_affected_state_flags() helper
This will be used when restoring tgsi from the on-disk shader
cache.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-02 08:31:28 +11:00
Timothy Arceri
8d3d8a6d4e st/mesa: st_atom_shader.c C99 tidy up
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-02 08:31:28 +11:00
Timothy Arceri
f3e2428a7a st/mesa: remove pre C99 statement block for variable declaration
Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-02-02 08:31:28 +11:00
Jason Ekstrand
0c114f2cf0 isl: Add assertions for render target swizzle restrictions
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-01 12:07:54 -08:00
Boyuan Zhang
f90ccf48bc st/va: add h264 constrained baseline profile
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-02-01 14:32:32 -05:00
Boyuan Zhang
d596bd29ec st/vdpau: add h264 constrained baseline profile
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-02-01 14:32:32 -05:00
Boyuan Zhang
c29191eea8 radeon/uvd: add h264 constrained baseline support
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-02-01 14:32:32 -05:00
Boyuan Zhang
22841ec84a vl: add h264 constrained baseline profile
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-02-01 14:32:32 -05:00
Bas Nieuwenhuizen
f5f8eb2c7c radv: Enable VK_KHR_shader_draw_parameters.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-02-01 19:49:40 +01:00
Bas Nieuwenhuizen
cf8a11c1ba radv: Pass draw index to shader.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-02-01 19:49:40 +01:00
Bas Nieuwenhuizen
80f4331ed1 radv/ac: Add draw index support.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-02-01 19:49:40 +01:00
Robert Foss
25f2d3c1d3 i965: Prevent coverity warning
Add assert checking that num_sources is never larger than 3.

This prevents Coverity from concluding that the unhandled
cases of num_sources not being 0-3 are relevant.
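
A minimal sketch of the pattern, not the actual i965 code (the helper name
and its body are made up for illustration): the assert states the invariant,
so an analyzer no longer follows the paths where num_sources exceeds 3.

```c
#include <assert.h>

/* Hypothetical helper: sums the sizes of up to three sources. */
static unsigned total_source_size(const unsigned *sizes, unsigned num_sources)
{
    /* State the invariant explicitly; larger values are a caller bug. */
    assert(num_sources <= 3);

    unsigned total = 0;
    switch (num_sources) {
    case 3: total += sizes[2]; /* fallthrough */
    case 2: total += sizes[1]; /* fallthrough */
    case 1: total += sizes[0]; /* fallthrough */
    case 0: break;
    }
    return total;
}
```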

Coverity-Id: 1399480-1399489
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-02-01 16:47:05 +00:00
Lionel Landwerlin
875b15eec4 spirv: add SPV_KHR_shader_draw_parameters support
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-01 15:08:33 +00:00
Lionel Landwerlin
bd46040162 compiler: add missing enums for debug
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-01 15:08:30 +00:00
Emil Velikov
1e8fd790e1 docs: add news item and link release notes for 13.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-01 11:21:59 +00:00
Emil Velikov
f2391e8134 docs: add sha256 checksums for 13.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 6bfc352f5a)
2017-02-01 11:20:28 +00:00
Emil Velikov
7b6931e7fb docs: add release notes for 13.0.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 3255d10da4)
2017-02-01 11:20:27 +00:00
Michel Dänzer
31136eae3a winsys/radeon: Allow visible VRAM size > 256MB with kernel driver >= 2.49
The kernel driver reports correct values now.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-02-01 16:38:14 +09:00
Tapani Pälli
58828fe4ae android: add vulkan build for intel
fixes to issues spotted by Emil Velikov:

   - set ANV_TIMESTAMP correctly
   - fix typo with VULKAN_GEM_FILES

v2: update to use Makefile.sources under vulkan
    instead of having own

v3: update to changes to generate from vk.xml
    (commit c7fc310)

v4: remove 'hw' relative path
    cleanups, remove unnecessary cruft

    review from Emil Velikov:

    - move to vulkan folder
    - remove timestamp gen, no longer necessary
    - more cleanups

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-02-01 07:58:49 +02:00
Ilia Mirkin
62b8f494fa mesa: use same is_color_attachment trick to discern error cases
All the other calls to retrieve the attachment have been covered except
this one - return the proper error for attachment points that are valid
enums but out of bound for the driver.

Fixes GL45-CTS.geometry_shader.layered_fbo.fb_texture_invalid_attachment

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-31 22:12:57 -05:00
Jason Ekstrand
92128590bc anv: Improve flushing around STATE_BASE_ADDRESS
It is not clear from the docs exactly how pipelined STATE_BASE_ADDRESS
actually is.  We know from experimentation that we need to flush the
render cache prior to emitting STATE_BASE_ADDRESS and invalidate the
texture cache afterwards.  The only thing the PRM says is that, on gen8+
we're supposed to invalidate the state cache after STATE_BASE_ADDRESS
but experimentation has indicated that doing so does nothing whatsoever.

Since we don't really know, let's do just a bit more flushing in the
hopes that this won't be a problem again.  In particular:

 1) Do a CS stall before we emit STATE_BASE_ADDRESS since we don't
    really know whether or not it's pipelined.

 2) Do a data cache flush in case what runs before STATE_BASE_ADDRESS
    is a compute shader.

 3) Invalidate the state and constant caches after STATE_BASE_ADDRESS
    because the state may be getting cached there (we don't really know).

Reported-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-01-31 18:49:44 -08:00
Jason Ekstrand
f1f9794118 anv: Flush render cache before STATE_BASE_ADDRESS on gen7
We had no good reason for *not* doing this on gen7 before but we didn't
know it was needed.  Recently, when trying to update to Vulkan CTS version
1.0.2 in our CI system, Mark discovered GPU hangs on Haswell that appear
to be STATE_BASE_ADDRESS related.  This commit fixes them.

Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-01-31 18:49:44 -08:00
Jason Ekstrand
4871930451 isl/formats: Only advertise sampling for A4B4G4R4 on Broadwell
This causes hangs on Broadwell if you try to render to it.  I have no
idea how we managed to not hit this earlier.

Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-01-31 18:49:44 -08:00
Jason Ekstrand
a0348b5a0b intel/blorp: Handle clearing of A4B4G4R4 on all platforms
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-01-31 18:49:44 -08:00
Tom Stellard
226a2c6d6e radeonsi: Fix build on LLVM < 3.9 v2
This was broken by: e0cc0a614c

v2:
  - Use preprocessor macro

Tested-by: Mark Janes <mark.a.janes@intel.com>
2017-02-01 02:10:00 +00:00
Bas Nieuwenhuizen
798ae37cc9 radv: Enable Float64 support.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-01 01:09:34 +01:00
Bas Nieuwenhuizen
441ee1e65b radv/ac: Implement Float64 SSBO loads.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-01 01:09:34 +01:00
Bas Nieuwenhuizen
bb1ce63002 radv/ac: Implement Float64 UBO loads.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-01 01:09:29 +01:00
Bas Nieuwenhuizen
03724af262 radv/ac: Implement Float64 load/store var.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-01 01:09:05 +01:00
Bas Nieuwenhuizen
91074bb11b radv/ac: Implement Float64 SSBO stores.
No f16 support as I'm not quite sure about alignment yet.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-01 01:09:05 +01:00
Bas Nieuwenhuizen
29577b2123 radv/ac: Add core Float64 support.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-01 01:09:05 +01:00
Rob Herring
01e18b21d1 vc4: Enable Neon on arm android builds
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-01-31 14:06:21 -08:00
Rob Herring
83107acb7b vc4: fix arm64 build with Neon
The addition of Neon assembly breaks on arm64 builds because the assembly
syntax is different. For now, restrict Neon to ARMv7 builds.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-01-31 14:06:19 -08:00
Rob Herring
6d92f32852 vc4: Make Neon inline assembly clang compatible
clang throws an error on "%r2" and similar. I couldn't find any
documentation on what "%r?" is supposed to mean and I've never seen any
use like that as far as I remember. The parameter is supposed to be
cpu_stride and just %2/%3 should be sufficient.

There's no need for trailing ";" either, so remove those, too.

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-01-31 14:06:09 -08:00
Tom Stellard
e0cc0a614c radeonsi: Set datalayout on the llvm module
This prevents LLVM from using sext instructions for local memory offsets
and allows the backend to fold immediate offsets into the instruction.

This also prevents some incorrect code generation for ptrtoint and
inttoptr instructions.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-31 20:39:30 +00:00
Francisco Jerez
11e9ebbf15 nir/spirv/glsl450: Implement IEEE-compliant handling of atan2(±∞, ±∞).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-01-31 10:33:33 -08:00
Francisco Jerez
013d40d1ce glsl: Implement IEEE-compliant handling of atan2(±∞, ±∞).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-01-31 10:33:33 -08:00
Francisco Jerez
7215375c44 nir/spirv/glsl450: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity.
See "glsl: Rewrite atan2 implementation to fix accuracy and handling
of zero/infinity." for the rationale, but note that the instruction
count benefit discussed there is somewhat less important for the SPIRV
implementation, because the current code already emitted no control
flow instructions -- Still this saves us one hardware instruction per
scalar component on Intel SKL hardware.

Fixes the following Vulkan CTS tests on Intel hardware:

    dEQP-VK.glsl.builtin.precision.atan2.highp_compute.scalar
    dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec2
    dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec3
    dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec4
    dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec2
    dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec4

Note that most of the test-cases above expect IEEE-compliant handling
of atan2(±∞, ±∞), which this patch doesn't explicitly handle, so
except for the last two the test-cases above weren't expected to pass
yet.  The reason they do is that the i965 back-end implementation of
the NIR fmin and fmax instructions is not quite GLSL-compliant (it
complies with IEEE 754 recommendations though), because fmin/fmax of a
NaN and a non-NaN argument currently always return the non-NaN
argument, which causes atan() to flush NaN to one and return the
expected value.  The front-end should probably not be relying on this
behavior for correctness though because other back-ends are likely to
behave differently -- A follow-up patch will handle the atan2(±∞, ±∞)
corner cases explicitly.
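
For illustration, the NaN-dropping minimum described above can be sketched
in plain C (an assumption-laden stand-in, not the i965 back-end code; the
function name is hypothetical):

```c
/* Sketch of an IEEE 754-style minimum: a single NaN operand is dropped
 * and the other operand is returned. */
static float min_drop_nan(float a, float b)
{
    if (a != a)          /* only NaN compares unequal to itself */
        return b;
    if (b != b)
        return a;
    return a < b ? a : b;
}
```

A min that propagated NaN instead would hand the NaN on to the rest of the
computation, which is why relying on this behavior is back-end specific.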

v2: Fix up argument scaling to take into account the range and
    precision of exotic FP24 hardware.  Flip coordinate system for
    arguments along the vertical line as if they were on the left
    half-plane in order to avoid division by zero which may give
    unspecified results on non-GLSL 4.1-capable hardware.  Sprinkle in
    some more comments.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-01-31 10:33:27 -08:00
Francisco Jerez
e9ffd12827 glsl: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity.
This addresses several issues of the current atan2 implementation:

 - Negative zero (and negative denorms which end up getting flushed to
   zero) isn't handled correctly by the current implementation.  The
   reason is that it does 'y >= 0' and 'x < 0' comparisons to decide
   on which side of the branch cut the argument is, which causes us to
   return incorrect results (off by up to 2π) for very small negative
   values.

 - There is a serious precision problem for x values of large enough
   magnitude introduced by the floating point division operation being
   implemented as a mul+rcp sequence.  This can lead to the quotient
   getting flushed to zero in some cases introducing an error of over
   8e6 ULP in the result -- Or in the most catastrophic case will
   cause us to return NaN instead of the correct value ±π/2 for y=±∞
   and x very large.  We can fix this easily by scaling down both
   arguments when the absolute value of the denominator goes above
   a certain threshold.  The error of this atan2 implementation remains
   below 25 ULP in most of its domain except for a neighborhood of y=0
   where it reaches a maximum error of about 180 ULP.

 - It emits a bunch of instructions including no less than three
   if-else branches per scalar component that don't seem to get
   optimized out later on.  This implementation uses about 13% fewer
   instructions on Intel SKL hardware and doesn't emit any control
   flow instructions.

v2: Fix up argument scaling to take into account the range and
    precision of exotic FP24 hardware.  Flip coordinate system for
    arguments along the vertical line as if they were on the left
    half-plane in order to avoid division by zero which may give
    unspecified results on non-GLSL 4.1-capable hardware.  Sprinkle in
    some more comments.
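
The negative-zero issue from the first point can be shown in plain C (a
sketch for illustration, not the GLSL IR this commit emits; both helper
names are hypothetical). A 'y >= 0' comparison cannot tell -0.0 from +0.0,
whereas inspecting the sign bit can:

```c
#include <math.h>
#include <stdbool.h>

/* Comparison-based test: -0.0 compares equal to +0.0, so it passes. */
static bool nonneg_by_compare(float y)
{
    return y >= 0.0f;
}

/* Sign-bit test: distinguishes -0.0 from +0.0. */
static bool nonneg_by_signbit(float y)
{
    return !signbit(y);
}
```

For y = -0.0 and x < 0 the comparison-based choice lands on the +π side of
the branch cut instead of -π, which is the 2π error mentioned above.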

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-01-31 10:32:45 -08:00
Francisco Jerez
69042a5be4 i965/fs: Fix nir_op_fsign of absolute value.
This does point at the front-end emitting silly code that could have
been optimized out, but the current fsign implementation would emit
bogus IR if abs was set for the argument (because it would apply the
abs modifier on an unsigned integer type), and we shouldn't rely on
the upper layer's optimization passes for correctness.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-01-31 10:32:43 -08:00
Francisco Jerez
7ec3af3f8f glsl/ir_builder: Add rcp builder.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-01-31 10:32:43 -08:00
Francisco Jerez
6643a97de3 glsl: Fix constant evaluation of the rcp op.
Will avoid a regression in a future commit that introduces some
additional rcp operations.  According to the GLSL 4.10 specification:

"Dividing by 0 results in the appropriately signed IEEE Inf."

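A sketch of what the quoted rule means for constant folding, assuming the
host compiler follows IEEE 754 float semantics (the function name is
hypothetical and this is not the Mesa constant-expression code):

```c
/* Fold rcp(x) at compile time without special-casing zero:
 * under IEEE 754, 1/+0 gives +Inf and 1/-0 gives -Inf. */
static float fold_rcp(float x)
{
    return 1.0f / x;
}
```
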
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-01-31 10:32:43 -08:00
Francisco Jerez
e81130d7a1 mesa/program: Translate csel operation from GLSL IR.
This will be used internally by the GLSL front-end in order to
implement some built-in functions. Plumb it through MESA IR for
back-ends that rely on this translation pass.

v2: Add comment.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-01-31 10:32:42 -08:00
Wladimir J. van der Laan
56314f5baf etnaviv: Set SE.CLIP registers, add margins for scissor/clip registers
This fixes rendering of full-screen quads (and other screen-filling
geometry, e.g. ioquake3 walls up-close) on gc3000. It should be a no-op
on other hardware.

- It looks like SE_CLIP registers were not set at all.
  I'm amazed that rendering worked without them. Emit them to
  avoid issues on gc3000.

- Define constants
  ETNA_SE_SCISSOR_MARGIN_RIGHT (0x1119)
  ETNA_SE_SCISSOR_MARGIN_BOTTOM (0x1111)
  ETNA_SE_CLIP_MARGIN_RIGHT (0xffff)
  ETNA_SE_CLIP_MARGIN_BOTTOM (0xffff)

  These demarcate the margin (fixp16) between the computed sizes and the
  value sent to the chip. I have set these to the numbers used by the
  Vivante driver for gc2000. I am not sure whether any old hardware was
  relying on the old numbers, or whether those were just a guess. But if
  so, these need to be moved to the _specs structure.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-01-31 19:29:23 +01:00
Wladimir J. van der Laan
fe3bb8cdb5 etnaviv: Generate new sin/cos instructions on GC3000
Shaders using sin/cos instructions were not working on GC3000.

The reason for this turns out to be that these chips implement sin/cos
in a different way (but using the same opcodes):

- Need their input scaled by 1/pi instead of 2/pi.

- Output an x and y component, which need to be multiplied to
  get the result.

- tex_amode needs to be set to 1.

Add a new bit to the compiler specs and generate these instructions
as necessary.

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-01-31 19:29:16 +01:00
Nanley Chery
33e0c5d003 anv/cmd_buffer: Use the proper depth input attachment surface state
Commit 2852efcda4 moved the location of
the depth input attachment surface state from the render pass to the
image view, but failed to update the surface state location used when
emitting the binding table. Fix this by loading the surface state from
the correct location.

Fixes:
dEQP-VK.renderpass.formats.d16_unorm.input.*
dEQP-VK.renderpass.formats.d24_unorm_s8_uint.input.*
dEQP-VK.renderpass.formats.d32_sfloat.input.*
dEQP-VK.renderpass.formats.x8_d24_unorm_pack32.input.*
dEQP-VK.renderpass.attachment_allocation.input_output.93
dEQP-VK.renderpass.attachment_allocation.input_output.92
dEQP-VK.renderpass.attachment_allocation.input_output.82
dEQP-VK.renderpass.attachment_allocation.input_output.46

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2017-01-31 09:00:50 -08:00
Bartosz Tomczyk
fc27181f9e glsl: fix heap-buffer-overflow
The `end+1` skips the ']', whereas the `strlen+1` includes the final
'\0' in the move to terminate the string.
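
A small self-contained sketch of the same arithmetic (the helper is
hypothetical and is not the code touched by this commit): it removes the
first "[...]" subscript from a name in place.

```c
#include <string.h>

/* Hypothetical helper: strips the first "[...]" subscript from name. */
static void strip_first_subscript(char *name)
{
    char *open = strchr(name, '[');
    if (!open)
        return;

    char *end = strchr(open, ']');
    if (!end)
        return;

    /* end + 1 skips the ']'; strlen(end + 1) + 1 also moves the final
     * '\0', so the result stays terminated and nothing past it is read. */
    memmove(open, end + 1, strlen(end + 1) + 1);
}
```

Using strlen(end) + 1 as the length instead would read one byte past the
terminator, which is the kind of overrun a heap checker reports.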

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-31 15:58:52 +01:00
Wladimir J. van der Laan
658568941d etnaviv: Cannot render to rb-swapped formats
Exposing rb swapped (or other swizzled) formats for rendering would
involve swizzling in the pixel shader. This is not the case at the
moment, so reject requests for creating such surfaces.

(GPUs that need an extra resolve step anyway due to multiple pixel
pipes, such as gc2000, might also do this swap in the resolve operation.
But this would be tricky to keep track of)

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-01-31 09:28:28 +01:00
Christian Gmeiner
82fe240a99 etnaviv: Avoid infinite loop in find_frame()
Use of an unsigned loop control variable with '>= 0' would lead
to an infinite loop.

Reported by clang:

etnaviv_compiler.c:1024:39: warning: comparison of unsigned expression
>= 0 is always true [-Wtautological-compare]
   for (unsigned sp = c->frame_sp; sp >= 0; sp--)
                                   ~~ ^  ~

v2: Simply use the same datatype as c->frame_sp is using.
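
A sketch of the corrected loop shape, assuming a signed index type (the
function and its arguments are hypothetical stand-ins for the frame stack
search, not the etnaviv code):

```c
/* Hypothetical stand-in: returns the highest index <= top whose mark is
 * set, or -1 if none is. */
static int find_last_marked(const int *marks, int top)
{
    /* With a signed index, sp can reach -1 and end the loop. With an
     * unsigned index, sp >= 0 is always true and sp-- wraps around. */
    for (int sp = top; sp >= 0; sp--) {
        if (marks[sp])
            return sp;
    }
    return -1;
}
```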

CC: <mesa-stable@lists.freedesktop.org>
Reported-by: Rhys Kidd <rhyskidd@gmail.com>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
2017-01-31 09:19:25 +01:00
Dave Airlie
8477aa71d9 radv/ac: apply slice rounding to 1d arrays as well.
Fixes:
dEQP-VK.glsl.texture_functions.texture.*1darray*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 11:13:15 +10:00
Dave Airlie
3882f3da22 radv/geom: check if esgs and gsvs ring exists before filling geom rings
There are some corner cases where you end up with an esgs ring, but no
gsvs ring, test for both before dereferencing.

Fixes:
dEQP-VK.geometry.emit.points_emit_0_end_0

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 11:13:15 +10:00
Dave Airlie
723941bb3d radv: enable geometryShader and multiViewport capabilities.
This enables geometry shader support on radv.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:30:53 +10:00
Dave Airlie
ca822e1b7c radv: handle layer export from vs->fs properly
Fixes:
dEQP-VK.geometry.layered.1d_array.fragment_layer

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:30:49 +10:00
Dave Airlie
c9c8ae1fd3 radv: emit esgs itemsize register.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:30:46 +10:00
Dave Airlie
77ec78669a radv: handle prim id inputs to fragment shader.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:30:41 +10:00
Dave Airlie
105ce24d46 radv: emit geometry shaders to hardware
This emits the compiled geometry shader and other state registers.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:30:37 +10:00
Dave Airlie
1fa5b755c2 radv: emit geometry ring size and pointers via preamble (v2)
This uses the scratch infrastructure to handle the esgs
and gsvs rings.

(this replaces the old code that did this with patching).

v2: fix correct ring sizes, reset sizes (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:30:19 +10:00
Dave Airlie
8f41fe4389 radv: add gs ring size calculations to pipeline.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:30:15 +10:00
Dave Airlie
99936d3606 radv: add pipeline creation support for geometry shaders (v2.1)
This adds gs copy shader support to the pipeline cache, and a few
geometry-related changes.

v2: rebase for spill changes.
v2.1: fix incorrect pipeline destruction.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:30:10 +10:00
Dave Airlie
fd4ea9e62d radv/ac: handle primitive id
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:30:08 +10:00
Dave Airlie
4ec294adce radv/ac: handle emitting vertex outputs to esgs ring.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:30:05 +10:00
Dave Airlie
ac642c6195 radv/ac: handle gs inputs
This handles geometry shader inputs written by the vertex (es) shader
to the esgs ring.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:30:01 +10:00
Dave Airlie
80cdf2c17e radv/ac: add geom input support to get deref offset.
This just adds the API and fixes up the callers.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:29:59 +10:00
Dave Airlie
23999a363b radv/ac: handle invocation and primitive id intrinsics
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:29:55 +10:00
Dave Airlie
63fa6c6eb4 radv/ac: handle geometry emit vertex and end prim intrinsics.
This handles emitting things to the gsvs ring, and sending the
correct GS msgs.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:29:52 +10:00
Dave Airlie
2a56186d57 radv/ac: handle emitting gs epilogue
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:29:48 +10:00
Dave Airlie
a615a01942 radv/ac: add copy shader creation
This creates the gs copy shader and compiles it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:29:40 +10:00
Dave Airlie
09cd037ca4 radv/ac: setup function parameters for vs as es and copy shader.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:29:33 +10:00
Dave Airlie
e1e9301b2a radv: pass some necessary gs info back to state handling.
We need this info to program some registers.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:29:30 +10:00
Dave Airlie
68a77411e1 radv: emit vertex shader to correct hw block.
This emits the shader to the ES block in the correct case.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:29:27 +10:00
Dave Airlie
2a57bddd4c radv/ac: propagate as_es flag into shader info from key.
This just places the flag into the shader info so we can use it from
the driver after we create the shader.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:29:23 +10:00
Dave Airlie
b941a88e01 radv: extend shader stage code to cover geometry shaders.
This enables the paths for setting up user ptrs to vs/es and gs.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:29:20 +10:00
Dave Airlie
ec7bf863d2 radv/ac: start setting up the geom shader rings (v2)
This sets up the rings and adds the variables
needed to make them work.

v2: rework for sharing ring and scratch
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:29:17 +10:00
Dave Airlie
ca91db2402 radv/ac: handle geom shader sgpr/vgpr inputs
This just sets up the gpr inputs.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:29:13 +10:00
Dave Airlie
374e978438 radv/ac: add geom shader sendmsg defines.
This just adds some defines needed for geom shaders.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:29:10 +10:00
Dave Airlie
583cf8efd4 radv/ac: add some geom shader info from nir->ac shader.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:28:50 +10:00
Dave Airlie
ecb8a34910 radv: move hw vertex shader emit to separate function
This is to later allow ES shaders to be emitted.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:28:46 +10:00
Dave Airlie
3b507855cb radv: fixup ia multi vgt param code to handle geom shaders.
This fixes up a few of the commented out blocks.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:28:28 +10:00
Dave Airlie
68c5da7e66 radv: add code to set gs_table_depth.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:28:24 +10:00
Dave Airlie
f26fa879b7 radv: add small helper to denote when a geom shader is in the pipeline.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:28:13 +10:00
Robert Foss
0b63f47030 radv: Prevent Coverity warning
Prevent Coverity from seeing potential errors when src is
not initialized in the switch case.

Coverity-Id: 1396397
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-01-30 23:59:22 +01:00
Timothy Arceri
30aa22dec0 mesa: add new MESA_GLSL flag for printing shader cache debug info
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-01-31 09:51:31 +11:00
Carl Worth
ba1eb854bd glsl: add cache to ctx and add sha1 string fields
We also add a flag for detecting shaders written to shader cache.

V2: don't leak cache

Signed-off-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-01-31 09:51:30 +11:00
Carl Worth
b8cb1a05cd glsl: add new uniform fields to be used to restore state from cache
Signed-off-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-01-31 09:51:30 +11:00
Carl Worth
0f60c6616e glsl: Switch to disable-by-default for the GLSL shader cache
The shader cache is expected to be developed incrementally over a
fairly long series of commits. For that period of instability, we
require users to opt into the shader cache by setting:

	MESA_GLSL_CACHE_ENABLE=1

In the future, when the shader cache is complete, we can revert this
commit so that the cache will be on by default.

The user can always disable the cache with
MESA_GLSL_CACHE_DISABLE=1. That functionality is not affected by this
commit, (nor will it be affected by the future revert).
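
The opt-in/opt-out rule described above can be sketched as a small
policy function. This is only an illustration: the helper names are
hypothetical, and treating any set value (not just "1") as "on" is an
assumption, not something the message states.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Pure policy function: each argument is the raw value of the matching
 * environment variable, or NULL when the variable is unset.
 * Assumption: any value counts as "set". */
static bool cache_enabled_from(const char *enable, const char *disable)
{
   if (disable != NULL)   /* MESA_GLSL_CACHE_DISABLE always wins */
      return false;
   return enable != NULL; /* otherwise the user must opt in */
}

/* Hypothetical wrapper that reads the two variables named in the message. */
static bool cache_enabled(void)
{
   return cache_enabled_from(getenv("MESA_GLSL_CACHE_ENABLE"),
                             getenv("MESA_GLSL_CACHE_DISABLE"));
}
```

Under this sketch, reverting to on-by-default later would only mean
dropping the opt-in test; the disable check stays as it is.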

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-01-31 09:51:30 +11:00
Dave Airlie
0ecd426490 radv/ac: implement txs for buffer textures.
This fixes a bunch of buffer related:
dEQP-VK.memory.pipeline_barrier.*
tests, that were crashing in LLVM due to this being missing.

Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 06:26:53 +10:00
Dave Airlie
ecc3fa3ba3 radv/ac: handle nir irem opcode.
This fixes:
dEQP-VK.spirv_assembly.instruction.compute.opsrem.*

Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 05:38:57 +10:00
Dave Airlie
059dd17175 radv/ac: fix multisample subpass image.
We weren't adding the fragment position properly.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 04:44:59 +10:00
Dave Airlie
a1c1ba7d56 radv: handle transfer_write as a dst flag.
It appears we can get image barriers like:
    srcStageMask:                   VkPipelineStageFlags = 4096 (VK_PIPELINE_STAGE_TRANSFER_BIT)
    dstStageMask:                   VkPipelineStageFlags = 4096 (VK_PIPELINE_STAGE_TRANSFER_BIT)
    dependencyFlags:                VkDependencyFlags = 0
    memoryBarrierCount:             uint32_t = 0
    pMemoryBarriers:                const VkMemoryBarrier* = NULL
    bufferMemoryBarrierCount:       uint32_t = 0
    pBufferMemoryBarriers:          const VkBufferMemoryBarrier* = NULL
    imageMemoryBarrierCount:        uint32_t = 1
    pImageMemoryBarriers:           const VkImageMemoryBarrier* = 0x7ffc882367b0
        pImageMemoryBarriers[0]:        const VkImageMemoryBarrier = 0x7ffc882367b0:
            sType:                          VkStructureType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER (45)
            pNext:                          const void* = NULL
            srcAccessMask:                  VkAccessFlags = 4096 (VK_ACCESS_TRANSFER_WRITE_BIT)
            dstAccessMask:                  VkAccessFlags = 4096 (VK_ACCESS_TRANSFER_WRITE_BIT)
            oldLayout:                      VkImageLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL (7)
            newLayout:                      VkImageLayout = VK_IMAGE_LAYOUT_GENERAL (1)
            srcQueueFamilyIndex:            uint32_t = 4294967295
            dstQueueFamilyIndex:            uint32_t = 4294967295
            image:                          VkImage = 0x2df55e0
            subresourceRange:               VkImageSubresourceRange = 0x7ffc882367e0:
                aspectMask:                     VkImageAspectFlags = 1 (VK_IMAGE_ASPECT_COLOR_BIT)
                baseMipLevel:                   uint32_t = 0
                levelCount:                     uint32_t = 1
                baseArrayLayer:                 uint32_t = 0
                layerCount:                     uint32_t = 1

This fixes all the CTS dEQP-VK.memory.pipeline_barrier.transfer_dst tests here,
though I'm not sure whether this is too large a hammer.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 04:42:21 +10:00
Samuel Pitoiset
af7fef12f7 r600: fix a compilation warning in r600_screen_create()
Should be r600_common_screen instead of r600_screen.

Fixes: 80157a2c20 ("gallium/radeon: clean up r600_query_init_backend_mask")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 18:13:18 +01:00
Marek Olšák
f8bc628b2c gallium/radeon: merge dirty_fb_counter and dirty_tex_descriptor_counter
to simplify things in draw_vbo a little

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 17:45:29 +01:00
Marek Olšák
75c425e511 winsys/radeon: clamp vram_vis_size to 256MB
the value from the kernel is wrong

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 17:45:29 +01:00
Marek Olšák
eba9e9dd1d radeonsi: handle count_from_stream_output in a few IA_MULTI_VGT_PARAM cases
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 17:45:29 +01:00
Marek Olšák
a0740d59aa radeonsi: don't invoke DCC decompression in update_all_texture_descriptors
This fixes a bug uncovered by the 17-part patch series, specifically:
  "gallium/radeon: merge dirty_fb_counter and dirty_tex_descriptor_counter"

If dirty_tex_counter has been updated and set_shader_image invokes DCC
decompression, the DCC decompression itself checks the counter and updates
descriptors, which in turn invokes the same DCC decompression. The blitter
can't handle the recursion and the driver eventually crashes.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 17:45:29 +01:00
Marek Olšák
f8dd2f5bac radeonsi: fold info->indirect conditionals into the last one in draw_vbo
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 17:29:36 +01:00
Marek Olšák
408f9a1584 radeonsi: atomize the scratch buffer state
The update frequency is very low.

Difference: Only account for the size when allocating a new one and when
            starting a new IB, and check for NULL. (v3)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 17:29:36 +01:00
Bartosz Tomczyk
a41f2527ae r600: Fix stack overflow
Commit 7b5878ee04 increased number of
outputs to 64, but left output array intact. This caused stack overflow
when number of outputs is bigger than 32. Found by ASAN.
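
A minimal sketch of the general pattern, not the r600 code itself: the
array is sized from the same constant as the limit, and the append
refuses to write past the end. All names here are hypothetical.

```c
#include <stdbool.h>

#define MAX_OUTPUTS 64  /* hypothetical limit, standing in for the driver's */

struct output_list {
   unsigned count;
   int slots[MAX_OUTPUTS];  /* sized from the same constant as the limit */
};

/* Append one output; refuse instead of writing past the array. */
static bool output_list_add(struct output_list *list, int value)
{
   if (list->count >= MAX_OUTPUTS)
      return false;
   list->slots[list->count++] = value;
   return true;
}
```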

Cc: "12.0 13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 15:30:03 +01:00
Samuel Pitoiset
e2c15ea092 gallium/radeon: add new HUD queries for monitoring the CP
There are even more counters in the CP_STAT register but I think
these ones are enough for now.

v2: only read (and expose) CP_STAT on VI+

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-30 14:37:00 +01:00
Samuel Pitoiset
0e04a078c5 gallium/radeon: add new GPU-sdma-busy HUD query
For simplicity, GPU-sdma-busy will return 0 on previous gens.

v2: only read SRBM_STATUS2 on Evergreen+

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-30 14:37:00 +01:00
Samuel Pitoiset
b0f7ddef4f gallium/radeon: rename grbm to mmio in the gpu load path
We also want to monitor other MMIO counters like SRBM_STATUS2 in
order to know if SDMA is busy.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-30 14:37:00 +01:00
Marek Olšák
2fc5fe0e85 winsys/amdgpu: add a fast exit path into amdgpu_cs_add_buffer
The time spent in the function dropped by 37% for torcs.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:57:09 +01:00
Samuel Pitoiset
86eb52adad winsys/amdgpu: do not iterate twice when adding fence dependencies
The perf difference is very small, 3.25->2.84% in amdgpu_cs_flush()
in the DXMD benchmark.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-30 13:44:25 +01:00
Samuel Pitoiset
5a6b1aadea winsys/amdgpu: add one likely() call in amdgpu_cs_flush()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-30 13:44:19 +01:00
Samuel Pitoiset
db2b0210b1 hud: fix compilation warnings in hud_nic_graph_install()
v2: use PRId64 instead of PRIx64

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-30 13:43:30 +01:00
Samuel Pitoiset
0b646ad05e st/mesa: make st_texture_get_sampler_view() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:42:50 +01:00
Marek Olšák
62732ce263 gallium/radeon: remove r600_common_context::max_db
this cleanup is based on the vulkan driver, which seems to do the same thing

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák
9327780da6 winsys/amdgpu: fix ADDR_REGISTER_VALUE::backendDisables
This would be a fix if the value was used anywhere.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák
80157a2c20 gallium/radeon: clean up r600_query_init_backend_mask
This just needs to be done for r600g in the screen.
We don't need an IB submission for every new context created for GCN.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák
5f99c49008 radeonsi: precompute IA_MULTI_VGT_PARAM values into a table
The perf difference is very small: 0.99% -> 0.40% for the time spent
in si_get_ia_multi_vgt_param when si_draw_vbo is 20%. Pretty much nothing.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák
c78177fc64 radeonsi: move VGT_VERTEX_REUSE_BLOCK_CNTL into shader states for Polaris
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák
ccecf79c2b radeonsi: state atom IDs don't have to be off by one
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák
ac059f1c23 radeonsi: use a bitmask for looping over dirty PM4 states
also move it to draw_vbo, because it should be 0 in most cases
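
For illustration, looping over a dirty bitmask could look roughly like
the sketch below. The function names are made up for this example and
the radeonsi code may well use a different helper.

```c
#include <stdint.h>

/* Index of the lowest set bit; the caller guarantees mask != 0. */
static unsigned lowest_set_bit(uint32_t mask)
{
   unsigned index = 0;
   while ((mask & 1u) == 0) {
      mask >>= 1;
      index++;
   }
   return index;
}

/* Visit only the dirty entries, recording their indices in order.
 * Returns how many entries were visited; out must hold 32 entries. */
static unsigned collect_dirty(uint32_t dirty_mask, unsigned *out)
{
   unsigned visited = 0;
   while (dirty_mask != 0) {
      unsigned i = lowest_set_bit(dirty_mask);
      dirty_mask &= dirty_mask - 1;  /* clear the bit just handled */
      out[visited++] = i;
   }
   return visited;
}
```

When the mask is 0, which the message says is the common case, the loop
body never runs.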

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák
802fcdc0d2 radeonsi: atomize L2 prefetches
to move the big conditional statement out of draw_vbo

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák
c99ba3eb47 radeonsi: unbind disabled shader stages to prevent useless L2 prefetches
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák
4a4ff66dbe radeonsi: also prefetch compute shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák
879c73fac8 radeonsi: update dirty_level_mask only after the first draw after FB change
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák
cecc068774 gallium/radeon: allow VRAM-only placements again on APUs & recent amdgpu
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák
0d0f357de6 radeonsi: don't set +fp64-denormals
it's the default and the name will change to +fp64-fp16-denormals.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Marek Olšák
b177162489 radeonsi: remove si_shader_context::param_tess_offchip
we don't use on-chip tess.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-30 13:27:14 +01:00
Lucas Stach
e158b74971 etnaviv: force vertex buffers through the MMU
This fixes a vertex data corruption issue if some of the vertex streams
go through the MMU and some don't.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
Acked-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2017-01-30 12:40:57 +01:00
Andres Rodriguez
33f418bd67 radv: Expose VK_KHR_maintenance1
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-01-30 08:44:11 +01:00
Andres Rodriguez
7b890a36df radv: Fix vkCmdCopyImage for 2d slices into 3d Images
Previously the z offset of the destination image was being ignored. It
should be taken into account when copying into a 3d target.

Also, img_extent_el.depth was being incorrectly clamped to 1 due to the
source image being VK_IMAGE_TYPE_2D. This would result in the blit
failing to iterate over all the 3d slices. Instead we clamp to the
destination image type.

Fixes failures in CTS tests:
dEQP-VK.api.copy_and_blit.image_to_image.3d_images.*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-01-30 08:44:07 +01:00
Bas Nieuwenhuizen
4eae3597eb radv: Expose transfer format features.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-01-30 08:42:26 +01:00
Bas Nieuwenhuizen
34bfe4b1bb radv: Don't allow any operations on non-supported depth/stencil formats.
We really use the depth block for the blits.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-01-30 08:42:26 +01:00
Andres Rodriguez
f8d5e1ab2d radv: use new error codes for AllocateDescriptorSets
There is a new error code in Maintenance1 that is more specific to the
situation: VK_ERROR_OUT_OF_POOL_MEMORY_KHR

Fixes CTS test case:
dEQP-VK.api.descriptor_pool.out_of_pool_memory

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-01-30 08:42:17 +01:00
Andres Rodriguez
e199a993b2 radv: vkAllocateCommandBuffers should NULL all output handles
This is part of the spec and fixes CTS tests:
dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_*
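
A rough sketch of the required failure behaviour, using plain malloc
rather than the Vulkan API. The function and its fail_at parameter are
hypothetical and exist only to simulate a mid-loop failure.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator: fail_at simulates an allocation failure at that
 * index; pass a value >= count for the success path. */
static int allocate_handles(void **out, size_t count, size_t fail_at)
{
   size_t i;

   for (i = 0; i < count; i++) {
      out[i] = (i == fail_at) ? NULL : malloc(1);
      if (out[i] == NULL)
         goto fail;
   }
   return 0;

fail:
   /* Free what was created, then clear every entry, not just those. */
   for (size_t j = 0; j < i; j++)
      free(out[j]);
   for (size_t j = 0; j < count; j++)
      out[j] = NULL;
   return -1;
}
```

The point is the second loop: entries past the failing index are also
cleared, so the caller never sees stale values in the output array.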

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-01-30 08:38:13 +01:00
Andres Rodriguez
ec0f5c005c radv: add trim command pool stub
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-01-30 08:37:54 +01:00
Kenneth Graunke
2f7a7ae131 i965: Support the force_glsl_version driconf option.
Gallium drivers have had this for a while.  It makes sense to support
it consistently across drivers, so expose it in i965 as well.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-01-29 18:20:57 -08:00
Kenneth Graunke
02216a1ddf i965: Fix check for negative pitch in can_do_fast_copy_blit().
At this point, the pitch is in bytes.  We haven't yet divided the pitch
by 4 for tiled surfaces, so abs(pitch) may be larger than 32K.  This
means the bit 15 trick won't work.

The caller now has signed integers anyway, so just pass those through
and do the obvious check.
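
To see why the bit 15 trick breaks, here is a small sketch. The exact
form of the old check is an assumption; the function names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed shape of the old test: bit 15 treated as a sign bit. This only
 * works while the magnitude of the pitch stays below 32768. */
static bool pitch_is_negative_bit15(int32_t pitch)
{
   return ((uint32_t)pitch & (1u << 15)) != 0;
}

/* The obvious check on a signed value. */
static bool pitch_is_negative(int32_t pitch)
{
   return pitch < 0;
}
```

With a pitch of 40000 bytes the bit 15 test reports "negative" although
the value is positive, while both tests agree for -512.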

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-29 18:20:35 -08:00
Bas Nieuwenhuizen
c4d7b9cd29 radv: Handle command buffers that need scratch memory.
v2: Create the descriptor BO with CPU access.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-01-30 02:07:20 +01:00
Bas Nieuwenhuizen
ccff93e138 radv: Track scratch usage across pipelines & command buffers.
Based on code written by Dave Airlie.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-01-30 02:07:16 +01:00
Bas Nieuwenhuizen
29c1f67e9f radv/ac: Add compiler support for spilling.
Based on code written by Dave Airlie.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-01-30 02:07:12 +01:00
Bas Nieuwenhuizen
d115b67712 radv/amdgpu: Support a preamble CS.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-01-30 02:07:08 +01:00
Timothy Arceri
2842dea310 i965: add assert to while_jumps_before_offset()
jip should always be negative here as it's the result of
do instruction - while instruction.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-30 10:17:54 +11:00
Timothy Arceri
77a6597bb7 i965: fix up asserts in brw_inst_set_jip()
We are casting from a signed 32bit int to an unsigned 16bit int
so shift 15 bits rather than 16.
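
As a sketch of the range argument (not the actual asserts in
brw_inst_set_jip(), which are not shown here): a signed value that must
survive a 16-bit field has 15 magnitude bits, so the bounds come from
1 << 15. Both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when value fits a 16-bit two's-complement field, i.e.
 * -(1 << 15) <= value <= (1 << 15) - 1. */
static bool fits_in_signed_16(int32_t value)
{
   return value >= -(1 << 15) && value <= (1 << 15) - 1;
}

/* Too-permissive variant with bounds derived from 1 << 16; shown only
 * to contrast with the check above. */
static bool fits_with_16_bit_bound(int32_t value)
{
   return value > -(1 << 16) && value <= (1 << 16) - 1;
}
```

A value such as 40000 passes the 16-bit bound but would not read back as
the same signed number from a 16-bit field.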

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-30 10:17:46 +11:00
Bas Nieuwenhuizen
b8ee45ebdc llvmpipe: Use LLVMDumpModule, not DumpModule.
Forgot the prefix ...

Fixes: 0fca80b3db
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
2017-01-29 17:03:25 +01:00
Bas Nieuwenhuizen
0fca80b3db various: Fix missing DumpModule with recent LLVM.
Since LLVM revision 293359 DumpModule gets only implemented when
either a debug build or LLVM_ENABLE_DUMP is set.

This patch adds a direct replacement for the function for radv and
radeonsi. However, as I don't know a good place to put common LLVM
code for all three I inlined the implementation for LLVMPipe.

v2: Use the new code for LLVM 3.4+ instead of LLVM 5+ & fixed indentation

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-01-29 10:25:00 +01:00
Ilia Mirkin
ce7a045fee r600g: use ieee variants of multiplication instructions
This matches the behavior of most other drivers, including nouveau,
radeonsi, and i965.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-29 00:00:07 -05:00
Ilia Mirkin
bacbb01105 r600g: add support for optionally using non-IEEE mul ops
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-28 23:59:43 -05:00
Eric Anholt
5b7e2697dc vc4: Coalesce into TLB writes as well as VPM/tex.
This generally cuts an instruction when blending is enabled and we thus
have a single instruction generating the color value.

total instructions in shared programs: 91759 -> 91634 (-0.14%)
instructions in affected programs:     5338 -> 5213 (-2.34%)
2017-01-28 19:35:20 -08:00
Eric Anholt
c1299615fb vc4: Avoid an extra temporary and mov in ffloor/ffract/fceil.
shader-db results:

total instructions in shared programs: 92611 -> 91764 (-0.91%)
instructions in affected programs:     27417 -> 26570 (-3.09%)

The star is one shader in glmark2's terrain (drops 16% of its
instructions), but there are also wins in mupen64plus and glb2.7.
2017-01-28 19:35:20 -08:00
Eric Anholt
0079df0b2d vc4: Flip the switch to run the GLSL compiler optimization loop once.
This has almost no effect on shader-db:

total instructions in shared programs: 92572 -> 92611 (0.04%)
instructions in affected programs:     4486 -> 4525 (0.87%)

Looking at 2 of the 7 different shaders that were hurt (all of which were
in mupen64), they all appear to be just differences in order of
instructions at the NIR level.

The advantage is that this should significantly reduce time in the compiler.
2017-01-28 19:35:20 -08:00
Kenneth Graunke
7c5629a269 i965: Unbind deleted shaders from brw_context, fixing malloc heisenbug.
Applications may delete a shader program, create a new one, and bind it
before the next draw.  With terrible luck, malloc may randomly return a
chunk of memory for the new gl_program that happened to be the exact
same pointer as our previously bound gl_program.  In this case, our
logic to detect new programs in brw_upload_pipeline_state() would break:

      if (brw->vertex_program != ctx->VertexProgram._Current) {
         brw->vertex_program = ctx->VertexProgram._Current;
         brw->ctx.NewDriverState |= BRW_NEW_VERTEX_PROGRAM;
      }

Because the pointer is the same, we'd think it was the same program.
But it could be wildly different - a different stage altogether,
different sets of resources, and so on.  This causes utter chaos.

As unlikely as this seems, I believe I hit this when running a subset
of the CTS in a loop, in a group of tests that churns through simple
programs, deleting and rebuilding them.  Presumably malloc uses a
bucketing cache of sorts, and so freeing up a gl_program and allocating
a new one fairly quickly causes it to reuse that memory.

The result was that brw->vertex_program->info.num_ssbos claimed the
program had SSBOs, while brw->vs.base.prog_data.binding_table claimed
that there were none.  This was crazy, because the binding table is
calculated from info.num_ssbos - the shader info appeared to change
between shader compile time and draw time.  Careful use of watchpoints
revealed that it was being clobbered by rzalloc's memset when building
an entirely different program...

Fortunately, our 0xd0d0d0d0 canary for unused binding table entries
caused us to crash out of bounds when trying to upload SSBOs, or we
may have never discovered this heisenbug.
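
The failure mode and the remedy named in the subject line can be
sketched outside the driver. Everything below is hypothetical naming;
the address reuse is simulated by rewriting one object in place rather
than relying on malloc.

```c
#include <stddef.h>

struct program { int id; };

struct context {
   const struct program *bound;  /* cached pointer, as in the snippet above */
   unsigned dirty;
};

#define NEW_PROGRAM 1u  /* hypothetical dirty bit */

/* Pointer comparison alone: cannot tell a new program from the old one
 * when both happen to live at the same address. */
static void upload_state(struct context *ctx, const struct program *current)
{
   if (ctx->bound != current) {
      ctx->bound = current;
      ctx->dirty |= NEW_PROGRAM;
   }
}

/* Remedy: forget the cached pointer when its program is deleted. */
static void program_deleted(struct context *ctx, const struct program *prog)
{
   if (ctx->bound == prog)
      ctx->bound = NULL;
}
```

Without the call to program_deleted(), a second program placed at the
same address never sets the dirty bit; with it, the next upload_state()
sees a mismatch and flags the change.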

Fixes crashes in GL45-CTS.compute_shader.sso-case2 when using a hacked
cts-runner that only runs GL45-CTS.compute_shader.s* in EGL config ID 5
at 64x64 in a loop with 100 iterations.

Cc: "17.0 13.0 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-27 21:52:37 -08:00
Bas Nieuwenhuizen
96c60b7f07 radv/ac: Use base in push constant loads.
Apparently the source is not an address but an offset, so we actually
need to use the base.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
2017-01-28 03:07:39 +01:00
Andres Rodriguez
e8047980d2 radv: drop support for VK_AMD_NEGATIVE_VIEWPORT_HEIGHT
This extension was not correctly supported, and it conflicts with the
VK_KHR_MAINTENANCE1 spec.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-28 11:02:35 +10:00
Dave Airlie
e9b16c74fa radv: implement VK_KHR_GET_PHYSICAL_DEVICE_PROPERTIES_2
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-28 10:52:23 +10:00
Dave Airlie
989ec61703 radv: use proper maximum slice for layered view
this fixes deferred shadows with geom shaders enabled.

but I think this fix is fine by itself.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-28 10:52:20 +10:00
Chad Versace
6403e37651 i965/sync: Implement fences based on Linux sync_file
This patch implements a new type of struct brw_fence, one that is based
on struct sync_file.

This completes support for EGL_ANDROID_native_fence_sync.

* Background

  Linux 4.7 added a new file type, struct sync_file. See

    commit 460bfc41fd52959311ed0328163f785e023857af
    Author:  Gustavo Padovan <gustavo.padovan@collabora.co.uk>
    Date:    Thu Apr 28 10:46:57 2016 -0300
    Subject: dma-buf/sync_file: de-stage sync_file headers

  A sync file is a cross-driver explicit synchronization primitive. In a
  sense, sync_file's relation to synchronization is similar to dma_buf's
  relation to memory: both are primitives that can be imported and
  exported across drivers (at least in theory).

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-27 13:10:07 -08:00
Chad Versace
0b6dd31d68 i965/sync: Rename brw_fence_insert()
Rename to brw_fence_insert_locked(). This is correct because the fence's
mutex is effectively locked, as all callers are also *creators* of the
fence, and have not yet returned the new fence.

This reduces noise in the next patch, which defines and uses
brw_fence_insert(), an unlocked variant.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-27 13:10:07 -08:00
Chad Versace
a5c17f5c29 i965/sync: Fail sync creation when batchbuffer flush fails
Pre-patch, brw_sync.c ignored the return value of
intel_batchbuffer_flush().

When intel_batchbuffer_flush() fails during eglCreateSync
(brw_dri_create_fence), we now give up, cleanup, and return NULL.

When it fails during glFenceSync, however, we blindly continue and hope
for the best because there does not exist yet a way to tell core GL that
sync creation failed.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-27 13:10:07 -08:00
Chad Versace
014d0e0f88 i965/sync: Add brw_fence::type
This is a refactor patch; no change in behavior is expected.

Add `enum brw_fence_type` and brw_fence::type. There is only one type
currently, BRW_FENCE_TYPE_BO_WAIT. This patch reduces a lot of noise in
the next, which adds new type BRW_FENCE_TYPE_SYNC_FD.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-27 13:10:06 -08:00
Chad Versace
d1ce499dae i965: Add intel_batchbuffer_flush_fence()
A variant of intel_batchbuffer_flush() with parameters for in and out
fence fds.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-27 13:10:06 -08:00
Chad Versace
358661c794 i965: Add intel_screen::has_fence_fd
This bool maps to I915_PARAM_HAS_EXEC_FENCE_FD.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-27 13:10:06 -08:00
Chad Versace
b8acb6b179 configure: Require libdrm >= 2.4.75
Required to implement EGL_ANDROID_native_fence_sync on i965.
Specifically, i965 needs drm_intel_gem_bo_exec_fence(),
I915_PARAM_HAS_EXEC_FENCE, and libsync.h.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-27 13:10:06 -08:00
Emil Velikov
cb6be5c8c0 configure.ac: list radeon in --with-vulkan-drivers help string
Analogous to what we do for the dri and gallium drivers.

Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-01-27 19:25:30 +00:00
Emil Velikov
6f2dec0a23 radv: automake: Don't install vk_platform.h or vulkan.h.
These files belong to the vulkan loader.

Identical to
045f38a507 vulkan: Don't install vk_platform.h or vulkan.h.

Cc: Dave Airlie <airlied@redhat.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-27 19:25:26 +00:00
Jason Ekstrand
d96ade1c4c anv: Advertise API version 1.0.39
I'm pretty sure we've kept up with the bug fixes.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-01-27 10:06:14 -08:00
Eric Engestrom
5f301fe2e6 gbm/dri: fix memory leaks in error path
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
[Emil Velikov: make sure it builds]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:58 +00:00
Emil Velikov
1d104f9aa7 docs/releasing: add a note about the relnotes template
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-27 17:56:58 +00:00
Emil Velikov
2e076af067 mesa: remove explicit __STDC_FORMAT_MACROS define
Analogous to previous commits.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-01-27 17:56:57 +00:00
Emil Velikov
1cfe97ff0e nouveau: remove explicit __STDC_FORMAT_MACROS define
Already handled by the build.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-01-27 17:56:57 +00:00
Emil Velikov
027e04932a scons: swr: remove explicit __STDC_.*_MACROS defines
Analogous to previous commits.

Cc: George Kyriazis <george.kyriazis@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-01-27 17:56:57 +00:00
Emil Velikov
e809fadb86 gallium: remove explicit __STDC_.*_MACROS defines
Analogous to previous commits.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-01-27 17:56:57 +00:00
Emil Velikov
01e28c6cf5 gallivm: remove explicit __STDC_.*_MACROS defines
Correctly handled by the build systems.

Cc: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-01-27 17:56:57 +00:00
Emil Velikov
74a174e12f glsl: remove explicit __STDC_FORMAT_MACROS define
Correctly handled by all the build systems.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-01-27 17:56:57 +00:00
Emil Velikov
9e9e917c26 autoconf: set all __STDC_*_MACROS
Analogous to previous commit(s), with a minor detail - here we set the
macros when building both C and C++ sources.

Resolving that is a more challenging task that we'll sort out another
day.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-01-27 17:56:57 +00:00
Emil Velikov
d68ffa9446 scons: always set __STDC_*_MACROS for C++ sources
Analogous to previous commit - just set the lot once throughout.

Cc: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-01-27 17:56:57 +00:00
Emil Velikov
13e2928d57 android: always set __STDC_*_MACROS for C++ sources
Various parts of the code depend on the macros being defined.

Just set those unconditionally, only where needed (c++ sources) so that
we can drop the workarounds through the code.

Cc: Rob Herring <robh@kernel.org>
Cc: Chih-Wei Huang <cwhuang@android-x86.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-01-27 17:56:57 +00:00
Emil Velikov
c4862fa382 st/xa: automake: remove duplicate -Wall
Already handled by configure.ac

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:57 +00:00
Emil Velikov
6a5850b04a mesa: move variable declaration to where its used
The variable replacement was unused when building w/o
ENABLE_SHADER_CACHE. Since we can mix variable declarations and code,
move it to where it's used.

Fixes: 9f8dc3bf03 "utils: build sha1/disk cache only with
Android/Autoconf"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-27 17:56:57 +00:00
Emil Velikov
01874d5278 st/mesa: use correct return statement for a void function
Analogous to previous commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-27 17:56:56 +00:00
Emil Velikov
c1960e23ff mesa: use correct return statement for a void function
Using return foo() is incorrect even if foo itself returns void.
Spotted by AppVeyor, as below:

teximage.c(3653) : warning C4098: 'copyteximage' : 'void' function returning a value

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-27 17:56:56 +00:00
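The C4098 warning quoted in the commit above can be illustrated with a minimal sketch. The names do_copy() and copy_good() are hypothetical stand-ins, not the functions touched in teximage.c; the flagged form is shown only inside a comment.

```c
static int calls; /* counts helper invocations, for illustration only */

/* Hypothetical helper that returns void. */
static void do_copy(int width)
{
   if (width > 0)
      calls++;
}

/* Form that MSVC flags with C4098, per the warning quoted above:
 *    static void copy_bad(int width) { return do_copy(width); }
 * Portable form: call the helper, then return without a value. */
static void copy_good(int width)
{
   do_copy(width);
   return;
}
```

Calling copy_good() behaves exactly like the flagged form would; only the spelling of the return statement changes.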
Emil Velikov
be3b5e015c svga: remove const qualifier from SVGA3D_vgpu10_GenMips() prototype
Does not match the function definition or how it's used. Triggers the
following warning in AppVeyor:

svga_cmd_vgpu10.c(1301) : warning C4028: formal parameter 2 different from declaration

Cc: Charmaine Lee <charmainel@vmware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:56 +00:00
Emil Velikov
cf00cc72e9 nir: add extra const notation in compare_blocks()
MSVC warns about different const qualifiers. Add the extra const to
silence it.

nir_phi_builder.c(244) : warning C4090: 'initializing' : different 'const' qualifiers
nir_phi_builder.c(245) : warning C4090: 'initializing' : different 'const' qualifiers

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-27 17:56:56 +00:00
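A rough sketch of the const placement involved in a qsort() comparator over an array of pointers, as in the compare_blocks() commit above. struct block and sort_blocks() are hypothetical stand-ins, and the exact declaration used in nir_phi_builder.c may differ.

```c
#include <stdlib.h>

struct block { unsigned index; }; /* hypothetical stand-in for a NIR block */

/* qsort() passes `const void *` pointers to the array elements.  The
 * elements are themselves pointers, so the const is kept at that level:
 * "pointer to const pointer to block". */
static int compare_blocks(const void *_a, const void *_b)
{
   const struct block * const *a = _a;
   const struct block * const *b = _b;
   return ((*a)->index > (*b)->index) - ((*a)->index < (*b)->index);
}

/* Sort an array of block pointers by ascending index. */
static void sort_blocks(struct block **blocks, size_t count)
{
   qsort(blocks, count, sizeof(blocks[0]), compare_blocks);
}
```

Dropping the inner const would discard a qualifier from what `const void *` points to, which is the kind of mismatch the C4090 warning reports.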
Emil Velikov
a2dea3b654 nir: silence implicit conversion to 64bit
MSVC warns about implicit conversion as below. Annotate the literal
appropriately to silence the warning.

nir_gather_info.c(249) : warning C4334: '<<' : result of 32-bit shift
implicitly converted to 64 bits (was 64-bit shift intended?)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-27 17:56:56 +00:00
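The C4334 warning in the commit above comes from shifting a 32-bit literal and then widening the result. A minimal sketch of the annotated form follows; slot_bit() is a hypothetical helper, not code from nir_gather_info.c.

```c
#include <stdint.h>

/* Build a 64-bit mask with bit `slot` set (slot must be below 64).
 * The literal is 64 bits wide, so the shift itself is done in 64 bits.
 * With a plain `1 << slot` the shift would be done in int and only
 * widened afterwards. */
static uint64_t slot_bit(unsigned slot)
{
   return UINT64_C(1) << slot;
}
```

An `ull` suffix on the literal has the same effect as UINT64_C() on common platforms.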
Emil Velikov
01849ae0dc i915, i965: automake: remove NA include directive
The path in question (... dri/intel/server) was removed years ago.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:56 +00:00
Emil Velikov
091f2b8c98 mesa/tests: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:56 +00:00
Emil Velikov
6ba96bdcab dri/osmesa: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:56 +00:00
Emil Velikov
ede4ff9adc dri/swrast: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:56 +00:00
Emil Velikov
5a0ba1e5de radeon, r200: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:56 +00:00
Emil Velikov
ee5de93269 mapi: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:56 +00:00
Emil Velikov
af860850a0 loader: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:56 +00:00
Emil Velikov
912b4f5472 glx/windows: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
5b874cee09 glx/apple: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jeremy Sequoia <jeremyhu@apple.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
d66f9e6d93 glx: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
d221bf9b91 d3dadapter9: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
517f34b4be st/dri: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
02f991c00d clover: automake: remove -I$(srcdir)
Already implicitly handled by the build system.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Aaron Watry <awatry@gmail.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
65d5a60cac clover: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Aaron Watry <awatry@gmail.com>
Cc: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
c5921ae0d2 egl: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
90ac5c339e i915: automake: include builddir prior to srcdir
Analogous to previous commit.

Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
4622c75dfb i965: automake: include builddir prior to srcdir
The latter can contain stale generated files, which, as-is, we'll end up
using.

Fixes: bfd17c76c1 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-01-27 17:56:55 +00:00
Emil Velikov
a922c82125 freedreno: automake: correctly set MKDIR_GEN
Analogous to previous commit.

Fixes: 4610e5ef28 "freedreno/ir3: fix sin/cos"
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Rob Clark <robclark@freedesktop.org>
Cc: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Reported-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
2017-01-27 17:56:55 +00:00
Emil Velikov
5eed48d237 i965: automake: correctly set MKDIR_GEN
Otherwise we might end up w/o the respective folder (depending on
autotools version) and fail at build time.

Fixes: bfd17c76c1 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-27 17:56:54 +00:00
Eric Engestrom
1ee2ae8348 anv: add missing extension errors in vk_errorf()
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-27 17:23:32 +00:00
Eric Engestrom
86879bf4ed anv: add missing core errors in vk_errorf()
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-27 17:23:32 +00:00
Lionel Landwerlin
ba26c79157 anv: don't assert on out of memory descriptor pool in debug mode
Fixes:
   dEQP-VK.api.descriptor_pool.out_of_pool_memory

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-01-27 17:23:32 +00:00
Eric Engestrom
4da0d1c59a docs/repository: fix name of main branch
This is git, not svn :P

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-27 16:52:23 +00:00
Eric Engestrom
87619a1a6a egl: EGL_PLATFORM_SURFACELESS_MESA is now upstream
EGL_PLATFORM_SURFACELESS_MESA is in eglext.h as of last commit.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-01-27 16:52:23 +00:00
Eric Engestrom
a98b3a0872 egl: update headers from registry
Khronos introduced a new macro (suggested by Google) to avoid using
C-style casts in C++ code, as those generate warnings.

Khronos Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16113
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-01-27 16:52:23 +00:00
Eric Engestrom
06842585df radv: add missing extension errors in vk_errorf()
v2(Bas): Remove the extra VK_ERROR_FRAGMENTED_POOL cases.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-01-27 17:33:05 +01:00
Eric Engestrom
43cf967512 radv: add missing core errors in vk_errorf()
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-01-27 17:33:05 +01:00
Andreas Boll
1f2a890ace configure.ac: Require LLVM for r300 only on x86 and x86_64
b3119a3 introduced a strict LLVM requirement for r300 on all
architectures and thus configure fails on architectures where LLVM is
not available or buggy.

r300 doesn't strictly require LLVM, but for performance reasons we
highly recommend LLVM usage. So require it at least on x86 and x86_64
architectures as we have done before b3119a3.

Fixes: b3119a3 ("configure.ac: Check gallium LLVM version in gallium_require_llvm")
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-27 12:31:17 +01:00
Nicolai Hähnle
c5e76a262a gallium: enable int64 on radeonsi, llvmpipe, softpipe
All of these have had support for the TGSI opcodes since before most of
the glsl compiler work landed.

Also update the docs accordingly, including the missing note about i965.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-27 10:19:48 +01:00
Dave Airlie
93dc5c1a06 st/mesa: add support for enabling ARB_gpu_shader_int64.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-27 10:19:43 +01:00
Dave Airlie
278580729a st/glsl_to_tgsi: add support for 64-bit integers
v2: add conversion opcodes.

v3 (idr): Rebase on replacement of TGSI_OPCODE_I2U64 with
TGSI_OPCODE_I2I64.

v4 (idr): "cut them down later" => Remove ir_unop_b2u64 and
ir_unop_u642b.  Handle these with extra i2u or u2i casts just like
uint(bool) and bool(uint) conversion is done.

v5 (nha): add clarifying comment about a subtle assumption

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-27 10:19:39 +01:00
Dave Airlie
f804506d4d gallium: Add integer 64 capability
v1.1: move to using a normal CAP. (Marek)

v2: fill in the cap everywhere

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-27 10:19:25 +01:00
Topi Pohjolainen
a283a4ee2f meta: Refactor texture format translation
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Topi Pohjolainen
542bb85049 intel/blorp/dbg: Name blit shaders for easy recognition in dumps
Blorp clears already have an equivalent.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Topi Pohjolainen
56094cfb9e i965/hiz/gen6: Stop setting false qpitch
which is not applicable for "all slices at each lod". The current
logic leads one to believe it has some purpose. When the miptree
layout is calculated, brw_miptree_layout_texture_array() sets
the qpitch unconditionally but later ignores it altogether
for ALL_SLICES_AT_EACH_LOD.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Topi Pohjolainen
b13d30a72b i965/blorp/gen6: Remove dead code in hiz setup
As the comment for intel_miptree_hiz_buffer::mt states, hiz_mt
only exists for gen6. In addition, intel_hiz_miptree_buf_create()
uses MIPTREE_LAYOUT_FORCE_ALL_SLICE_AT_LOD unconditionally.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Topi Pohjolainen
b864e3d7ee i965/gen6: Simplify hiz surface setup
In intel_hiz_miptree_buf_create() intel_miptree_aux_buffer::bo
is unconditionally initialised to point to the same buffer
object as hiz_mt does. The same goes for
intel_miptree_aux_buffer::pitch/qpitch.

This will make following patches simpler to read.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Topi Pohjolainen
40bf622ced i965/blorp/gen6: Simplify hiz surface setup
In intel_hiz_miptree_buf_create() intel_miptree_aux_buffer::bo
is unconditionally initialised to point to the same buffer
object as hiz_mt does. Also intel_miptree_aux_buffer::offset
is initialised to zero (calloc()).

This will make following patches significantly simpler to read.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Topi Pohjolainen
5201d2991b i965/gen6: Remove check for stencil format
There is no alternative.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Topi Pohjolainen
19412abb3f i965: Remove check for hiz on earlier gens than SNB
The only caller, brw_workaround_depthstencil_alignment(), returns
early for gen6+.

While at it, reduce scope for brw_get_depthstencil_tile_masks() as
well.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Topi Pohjolainen
26a9e039fd i965/miptree: Remove redundant check for null texture
There is the exact same check earlier in brw_miptree_layout(), which
intel_miptree_create_layout() in turn calls unconditionally.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:26 +02:00
Topi Pohjolainen
bcec4113cc i965/miptree: Tell when brw_miptree_layout() fails
In addition, let intel_miptree_create_layout() release the
miptree - it is the allocator.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:25 +02:00
Topi Pohjolainen
aa9e21a316 i965/meta: Remove unused brw_get_rb_for_slice()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-27 08:57:25 +02:00
Michel Dänzer
d9f8bae616 clover: Fix build against clang SVN >= r293097
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2017-01-27 09:53:14 +09:00
Eric Anholt
9baf1ff8fc vc4: Use NEON to speed up utile stores on Pi2+.
Improves 1024x1024 TexSubImage2D by 41.2371% +/- 3.52799% (n=10).
2017-01-26 12:50:05 -08:00
Eric Anholt
4d30024238 vc4: Use NEON to speed up utile loads on Pi2.
We had a lot of memcpy call overhead because gpu_stride wasn't being
inlined.  But if you split out the stride==8 and stride==16 cases like
this code does while still using memcpy, you'd no longer have glibc's
NEON memcpy applied, at which point we'd be doing 16 uncached reads
instead of 64/(NEON memcpy granularity), for about a 30% performance
hit.  By hand writing the assembly, we can get a whole cacheline
loaded at a time.

Unfortunately, NEON intrinsics turned out to be unusable -- they
didn't have the vldm instruction available.

Note that, for now, the NEON code is only enabled when building for ARMv7
(Pi 2+).  We may want to do runtime detection for the Raspbian case, in
the future.

Improves 1024x1024 GetTexImage by 208.256% +/- 7.07029% (n=10).
2017-01-26 12:48:10 -08:00
Eric Anholt
347b69e7d7 vc4: Move LT tiling code to a separate file.
This paves the way for building it twice, with NEON assembly or not.
2017-01-26 12:23:31 -08:00
Eric Anholt
14cf5c60b8 vc4: Use unreachable() in an unreachable codepath for tiling.
2017-01-26 12:23:31 -08:00
Samuel Pitoiset
eca96ea308 gallium/radeon: add VRAM-vis-usage HUD query
This new query returns the current visible usage of VRAM accessed
by the CPU. It will return 0 on radeon because it's unimplemented.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-26 19:40:52 +01:00
Samuel Pitoiset
9f087e1c7c gallium/radeon: query the CPU accessible size of VRAM
R600_DEBUG="info" can be used to display that size, as well as
the total amount of VRAM/GTT.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-26 19:40:14 +01:00
Ian Romanick
13439031c8 mesa: Arrange validate_uniform_parameters parameters to match call sites
Saves a measly 20 bytes on IA32 and nothing on x64.  Depending on
exactly when this is applied, a lot of variation is possible due to
function alignment.

   text	   data	    bss	    dec	    hex	filename
6670131	 228340	  22552	6921023	 699b3f	lib/i965_dri.so before
6670111	 228340	  22552	6921003	 699b2b	lib/i965_dri.so after
6342932	 293872	  29880	6666684	 65b9bc	lib64/i965_dri.so before
6342932	 293872	  29880	6666684	 65b9bc	lib64/i965_dri.so after

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-01-26 09:46:18 -08:00
Ian Romanick
9be5fd3c87 mesa: Arrange _mesa_uniform parameters to match the call sites
By putting the parameters first that match the parameters to the call
site, 4 (of 14) instructions are saved at _mesa_Uniform4fv on x64.  On
IA32, the details of the instructions change, but it is the same count
and mix of instructions.

Before:

0000000000000830 <_mesa_Uniform4fv>:
     830:       48 83 ec 10             sub    $0x10,%rsp
     834:       49 89 d0                mov    %rdx,%r8
     837:       48 8b 15 00 00 00 00    mov    0x0(%rip),%rdx        # 83e <_mesa_Uniform4fv+0xe>
     83e:       89 f8                   mov    %edi,%eax
     840:       89 f1                   mov    %esi,%ecx
     842:       41 b9 02 00 00 00       mov    $0x2,%r9d
     848:       64 48 8b 3a             mov    %fs:(%rdx),%rdi
     84c:       48 8b 97 c8 01 02 00    mov    0x201c8(%rdi),%rdx
     853:       48 8b 72 70             mov    0x70(%rdx),%rsi
     857:       6a 04                   pushq  $0x4
     859:       89 c2                   mov    %eax,%edx
     85b:       e8 00 00 00 00          callq  860 <_mesa_Uniform4fv+0x30>
     860:       48 83 c4 18             add    $0x18,%rsp
     864:       c3                      retq

After:

00000000000007f0 <_mesa_Uniform4fv>:
     7f0:       48 83 ec 10             sub    $0x10,%rsp
     7f4:       48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 7fb <_mesa_Uniform4fv+0xb>
     7fb:       41 b9 02 00 00 00       mov    $0x2,%r9d
     801:       64 48 8b 08             mov    %fs:(%rax),%rcx
     805:       48 8b 81 c8 01 02 00    mov    0x201c8(%rcx),%rax
     80c:       6a 04                   pushq  $0x4
     80e:       4c 8b 40 70             mov    0x70(%rax),%r8
     812:       e8 00 00 00 00          callq  817 <_mesa_Uniform4fv+0x27>
     817:       48 83 c4 18             add    $0x18,%rsp
     81b:       c3                      retq

Saves a measly 416 bytes of text on x64.  Depending on exactly when this
is applied, a lot of variation is possible due to function alignment.

   text	   data	    bss	    dec	    hex	filename
6670131	 228340	  22552	6921023	 699b3f	lib/i965_dri.so before
6670131	 228340	  22552	6921023	 699b3f	lib/i965_dri.so after
6343348	 293872	  29880	6667100	 65bb5c	lib64/i965_dri.so before
6342932	 293872	  29880	6666684	 65b9bc	lib64/i965_dri.so after

There is likely to be no performance change with just this patch.
_mesa_uniform immediately calls validate_uniform_parameters with
parameters in the "wrong" (different from the call site) order.

v2: Rebase on GL_ARB_gpu_shader_fp64.

v3: Rebase on GL_ARB_gpu_shader_int64.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-01-26 09:46:14 -08:00
Ian Romanick
9f7ac45ce4 mesa: Arrange _mesa_uniform_matrix parameters to match the call sites
By putting the parameters first that match the parameters to the call
site, 4 (of 16) instructions are saved at _mesa_UniformMatrix4fv on
x64.  On IA32, the details of the instructions change, but it is the
same count and mix of instructions.

Before:

0000000000001380 <_mesa_UniformMatrix4fv>:
    1380:       48 83 ec 10             sub    $0x10,%rsp
    1384:       48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 138b <_mesa_UniformMatrix4fv+0xb>
    138b:       41 89 f8                mov    %edi,%r8d
    138e:       41 89 f1                mov    %esi,%r9d
    1391:       0f b6 d2                movzbl %dl,%edx
    1394:       64 48 8b 38             mov    %fs:(%rax),%rdi
    1398:       48 8b b7 c8 01 02 00    mov    0x201c8(%rdi),%rsi
    139f:       48 8b 76 70             mov    0x70(%rsi),%rsi
    13a3:       68 06 14 00 00          pushq  $0x1406
    13a8:       51                      push   %rcx
    13a9:       52                      push   %rdx
    13aa:       b9 04 00 00 00          mov    $0x4,%ecx
    13af:       ba 04 00 00 00          mov    $0x4,%edx
    13b4:       e8 00 00 00 00          callq  13b9 <_mesa_UniformMatrix4fv+0x39>
    13b9:       48 83 c4 28             add    $0x28,%rsp
    13bd:       c3                      retq

After:

0000000000001360 <_mesa_UniformMatrix4fv>:
    1360:       48 83 ec 10             sub    $0x10,%rsp
    1364:       48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 136b <_mesa_UniformMatrix4fv+0xb>
    136b:       0f b6 d2                movzbl %dl,%edx
    136e:       64 4c 8b 00             mov    %fs:(%rax),%r8
    1372:       49 8b 80 c8 01 02 00    mov    0x201c8(%r8),%rax
    1379:       68 06 14 00 00          pushq  $0x1406
    137e:       6a 04                   pushq  $0x4
    1380:       6a 04                   pushq  $0x4
    1382:       4c 8b 48 70             mov    0x70(%rax),%r9
    1386:       e8 00 00 00 00          callq  138b <_mesa_UniformMatrix4fv+0x2b>
    138b:       48 83 c4 28             add    $0x28,%rsp
    138f:       c3                      retq

Saves a measly 576 bytes of text on x64.

   text	   data	    bss	    dec	    hex	filename
6670131	 228340	  22552	6921023	 699b3f	lib/i965_dri.so before
6670131	 228340	  22552	6921023	 699b3f	lib/i965_dri.so after
6343924	 293872	  29880	6667676	 65bd9c	lib64/i965_dri.so before
6343348	 293872	  29880	6667100	 65bb5c	lib64/i965_dri.so after

v2: Rebase on GL_ARB_gpu_shader_fp64.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-01-26 09:46:09 -08:00
Ian Romanick
874393186b mesa: Trivial clean-ups in uniform_query.cpp
This is C++, so we can mix code and declarations.  Doing so allows
constification.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-01-26 09:46:07 -08:00
Lionel Landwerlin
bbe8705c57 spirv: handle undefined components for OpVectorShuffle
Fixes:
   dEQP-VK.spirv_assembly.instruction.compute.opspecconstantop.vector_related
   dEQP-VK.spirv_assembly.instruction.graphics.opspecconstantop.vector_related*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-01-26 17:31:21 +00:00
Lionel Landwerlin
df7063cba3 spirv: handle OpUndef as part of the variable parsing pass
Consider the following bit of a SPIR-V shader:

...
%zero        = OpConstant %i32 0
%ivec3_0     = OpConstantComposite %ivec3 %zero %zero %zero
%vec3_undef  = OpUndef %ivec3
%sc_0        = OpSpecConstant %i32 0
%sc_1        = OpSpecConstant %i32 0
%sc_2        = OpSpecConstant %i32 0
...

Our compiler currently stops parsing variables & types on the OpUndef
and switches to instructions, leaving the following sc_[0-2] variables
untreated.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-01-26 17:29:29 +00:00
Lionel Landwerlin
c3421106ec anv: fix descriptor pool internal size allocation
The size of the pool is slightly smaller than the size of the
structure containing the whole pool. We need to take that into account
when setting up the internals.

Fixes a crash due to out of bound memory access in:
   dEQP-VK.api.descriptor_pool.out_of_pool_memory

v2: Drop debug traces (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-01-26 17:24:21 +00:00
Kenneth Graunke
f8f7ea508b i965: Make intelEmitCopyBlit not truncate large strides.
When trying to blit larger tiled surfaces, the pitch can be larger than
32768 bytes, which means it won't fit in a GLshort.  Passing it in will
truncate the stride to 0, which has...surprising results.

The pitch can be up to 32,768 DWords, or 128kB.  We measure it in bytes,
but divide by 4 when programming it.  So we need to handle values up to
131,072.  Switch from GLshort to int32_t to avoid the truncation.

Fixes GL45-CTS.gtf30.GL3Tests.depth_texture.depth_texture_copyteximage
at widths greater than 8192.

v2: Use int32_t as negative values can be used (Jason).

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-26 01:43:20 -08:00
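The arithmetic in the commit above can be checked with a small sketch. pitch_low16() and pitch_dwords() are hypothetical helpers, not functions from the i965 blitter code; the first shows which bits of a byte pitch survive in a 16-bit parameter, the second the divide-by-4 used when programming the pitch.

```c
#include <stdint.h>

/* Low 16 bits of a byte pitch: all that remains once the value has been
 * squeezed into a 16-bit parameter. */
static uint16_t pitch_low16(int32_t pitch_bytes)
{
   return (uint16_t)((uint32_t)pitch_bytes & 0xFFFFu);
}

/* The pitch is measured in bytes but programmed in DWords. */
static int32_t pitch_dwords(int32_t pitch_bytes)
{
   return pitch_bytes / 4;
}
```

For the maximum pitch named in the commit, 131,072 bytes, pitch_dwords() gives 32,768 while pitch_low16() gives 0, which matches the "truncate the stride to 0" behaviour described. An int32_t parameter holds the full value and still allows negative pitches.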
Kenneth Graunke
fcf723b647 i965: Use a UW source type for CS_OPCODE_CS_TERMINATE.
SIMD16 compute shaders use a send(16) with mlen 1 for the EOT message,
using a source of g127 for the single register.  With a UD type, this
supposedly could read g128, which doesn't exist, causing the simulator
to get cranky.  Use a UW type to avoid this.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-01-26 00:52:52 -08:00
Iago Toral Quiroga
9b25769da6 anv/lower_input_attachments: honor sample index parameter to subpassLoad()
According to GL_KHR_vulkan_glsl, the signature of subpassLoad() is:

gvec4 subpassLoad(gsubpassInput   subpass);
gvec4 subpassLoad(gsubpassInputMS subpass, int sample);

So the multisampled case always receives an explicit sample index that we
should use. The current implementation was ignoring this parameter
and using gl_SampleID value instead.

Fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_id.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-26 08:11:21 +01:00
Kenneth Graunke
5106df85da i965: Fix fast depth clears for surfaces with a dimension of 16384.
I hadn't bothered to set this bit because I figured it would just
paper over us getting the rectangle wrong.  But it turns out that
there is a legitimate reason to use it, so let's do so.

The alternative would be to chop up 16k clears to multiple 8k clears,
which is pointlessly painful.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2017-01-25 22:24:08 -08:00
Chad Versace
022e5c7e5a anv: Implement VK_KHR_get_physical_device_properties2
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-25 19:18:47 -08:00
Chad Versace
cd03021c83 anv: Refactor anv_GetPhysicalDeviceQueueFamilyProperties()
Add a helper function, anv_get_queue_family_properties(), which fills the
struct.  This patch reduces churn in the following patch that implements
vkGetPhysicalDeviceQueueFamilyProperties2KHR.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-25 19:18:46 -08:00
Chad Versace
5826190095 anv: Refactor anv_GetPhysicalDeviceFormatProperties()
Add a helper function, anv_get_image_format_properties(), which does all
the work and has a VkPhysicalDeviceImageFormatInfo2KHR parameter. This
patch reduces churn in the following patch that implements
vkGetPhysicalDeviceImageFormatProperties2KHR.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-25 19:18:43 -08:00
Chad Versace
b2de77a07d anv: Revive struct anv_common
The struct was deleted by:
  commit efe9d1cde3
  Author: Edward O'Callaghan <funfunctor@folklore1984.net>
  Subject: anv: Clean up some unused variables

Unlike the original anv_common, the new one has a non-const pNext
pointer because we will use it for the output structs of
VK_KHR_get_physical_device_properties2.

v2:
  - Retype pNext from void* to struct anv_common*.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-25 19:18:33 -08:00
Chad Versace
c5d99c9983 anv: Define macro anv_debug()
This is a printf-like macro that prints a debug message to stderr when
built with DEBUG.  Without DEBUG, it does nothing.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-25 19:17:45 -08:00
Ian Romanick
fd43bee0ea mesa: Fix copy-and-paste bug in _mesa_(Program|)Uniform[1234](i|ui)64vARB functions
All of the functions were passing 1 to _mesa_uniform instead of passing
count.

Fixes 16 unused parameter warnings like:

main/uniforms.c: In function ‘_mesa_Uniform1i64vARB’:
main/uniforms.c:1692:47: warning: unused parameter ‘count’ [-Wunused-parameter]
 _mesa_Uniform1i64vARB(GLint location, GLsizei count, const GLint64 *value)
                                               ^~~~~

This is why I build with extra warnings enabled.  Unfortunately, there
are so many unused parameter warnings in Mesa that I didn't notice these
added warnings for over 6 months. :(
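
A minimal sketch of this class of bug, not the actual Mesa code:
example_upload() is a hypothetical stand-in for _mesa_uniform, and both
wrappers are invented for illustration.  The buggy wrapper accepts count
but forwards a literal 1, which is what -Wunused-parameter points at.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a uniform upload routine: copies `count`
 * 64-bit values into `dst` and returns how many elements were written. */
static size_t example_upload(int64_t *dst, const int64_t *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < count; i++)
        dst[i] = src[i];
    return count;
}

/* Buggy wrapper: `count` is accepted but never used, so only the first
 * element is uploaded no matter what the caller asked for. */
static size_t example_uniform_buggy(int64_t *dst, size_t count,
                                    const int64_t *value)
{
    return example_upload(dst, value, 1);
}

/* Fixed wrapper: forwards the caller's count. */
static size_t example_uniform_fixed(int64_t *dst, size_t count,
                                    const int64_t *value)
{
    return example_upload(dst, value, count);
}
```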

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-25 09:28:40 -08:00
Lionel Landwerlin
173dd60ced spirv: bump headers to SPIRV 1.1
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-25 17:22:23 +00:00
Lionel Landwerlin
05e2d99bf2 spirv: add default handler for new enums
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-25 17:22:23 +00:00
Lionel Landwerlin
4fd54d611f spirv: fix typos
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-25 17:21:15 +00:00
Lionel Landwerlin
25e21cb8d0 anv: set command buffer to NULL when allocations fail
The spec section 5.2 says:

   "vkAllocateCommandBuffers can be used to create multiple command
   buffers. If the creation of any of those command buffers fails, the
   implementation must destroy all successfully created command buffer
   objects from this command, set all entries of the pCommandBuffers
   array to VK_NULL_HANDLE and return the error."

Fixes:
   dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_primary
   dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_secondary
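
The behaviour the spec asks for can be sketched roughly as below.  This
is plain C with malloc/free standing in for command buffer creation and
destruction; example_create(), its fail_at parameter and
example_allocate_all() are hypothetical names, not anv functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical allocator that fails on the `fail_at`-th object (0-based).
 * Pass a negative `fail_at` to never fail. */
static void *example_create(int index, int fail_at)
{
    if (index == fail_at)
        return NULL;
    return malloc(16);
}

/* Allocate `count` objects into `out`.  On failure, destroy everything
 * created so far, set every entry of the array to NULL and return -1. */
static int example_allocate_all(void **out, int count, int fail_at)
{
    int i;

    assert(out != NULL && count >= 0);
    for (i = 0; i < count; i++) {
        out[i] = example_create(i, fail_at);
        if (out[i] == NULL)
            break;
    }
    if (i == count)
        return 0;

    /* Only entries [0, i) were successfully created by this call. */
    for (int j = 0; j < i; j++)
        free(out[j]);
    /* Null out the whole array, including entries never reached. */
    for (int j = 0; j < count; j++)
        out[j] = NULL;
    return -1;
}
```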

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-01-25 17:15:30 +00:00
Jason Ekstrand
d6397dd625 vulkan/wsi: Lower the maximum image sizes
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
2017-01-25 09:05:30 -08:00
Jason Ekstrand
659edd9f5c vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetPresentModes
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
2017-01-25 09:05:25 -08:00
Jason Ekstrand
dc578ef060 vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetFormats
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
2017-01-25 09:04:56 -08:00
George Kyriazis
e259efd805 swr: Update fs texture & sampler state logic
In swr_update_derived() update texture and sampler state on a new fragment
shader.  GALLIUM_HUD can update fs using a previously bound texture and
sampler.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-01-25 10:02:50 -06:00
Samuel Pitoiset
cff199ceb7 gallium/radeon: add a new HUD query for the number of mapped buffers
Useful when debugging applications which map a ton of buffers
and also because we used to run into Linux's limit on the number
of simultaneous mmap() calls.

v2: - update the commit message

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-25 15:19:21 +01:00
Iago Toral Quiroga
56495080ed spirv: handle gl_SampleMask
SPIR-V maps both gl_SampleMask and gl_SampleMaskIn to the same
builtin (SampleMask). The only way to tell which one we are dealing with
is to check if it is an input or an output.

Fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_mask.write.*
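
As a rough illustration of the input/output check (all enum and function
names here are made up and do not match the spirv_to_nir code):

```c
#include <assert.h>

/* Hypothetical storage modes of a variable decorated with SampleMask. */
enum example_mode { EXAMPLE_MODE_INPUT, EXAMPLE_MODE_OUTPUT };

/* Hypothetical driver-side values the single builtin can map to. */
enum example_value { EXAMPLE_SAMPLE_MASK_IN, EXAMPLE_SAMPLE_MASK_OUT };

/* The builtin alone is ambiguous, so the variable's mode decides. */
static enum example_value example_resolve_sample_mask(enum example_mode mode)
{
    assert(mode == EXAMPLE_MODE_INPUT || mode == EXAMPLE_MODE_OUTPUT);
    return mode == EXAMPLE_MODE_INPUT ? EXAMPLE_SAMPLE_MASK_IN
                                      : EXAMPLE_SAMPLE_MASK_OUT;
}
```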

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-25 08:08:16 +01:00
Iago Toral Quiroga
9467d78d38 spirv: acknowledge multisampled input attachments
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-25 08:07:09 +01:00
Dave Airlie
2ab2be092d radv: program a default point size.
Along the lines of what
3b804819 anv: Default PointSize to 1.0 if not written by the shader
does for anv, program a default point size in the hw of 1.0.

This preemptively fixes a bunch of geom shader tests.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-25 09:58:38 +10:00
Marek Olšák
eac7df43ca radeonsi: handle first_non_void correctly in si_create_vertex_elements
This fixes R11G11B10_FLOAT, because it's in the category of "OTHER",
meaning that it doesn't have any channel description.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-24 23:52:01 +01:00
Marek Olšák
d9ef549238 st/mesa: destroy pipe_context before destroying st_context (v2)
If radeonsi starts compiling an optimized shader variant asynchronously
with a GL debug callback set and the application destroys the GL context,
radeonsi crashes when trying to write shader stats into the debug output
of a non-existent context after compilation, because st/mesa was destroyed
before pipe_context.

Firefox with WebGL2 enabled hits this bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99456

v2: protect against a double destroy in st_create_context_priv and callers.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-24 23:52:01 +01:00
Timothy Arceri
dd65f0efc9 nir: bump loop max unroll limit
The original number was chosen in an attempt to match the limits applied to
GLSL IR.

A look at the git history of why these limits were chosen for GLSL IR
shows it was more to do with the slow speed of unrolling large loops in
GLSL IR than anything else. The speed of loop unrolling in NIR is not a
problem so we may wish to bump this even higher in future.

No shader-db change; however, a future change that disables the GLSL IR
optimisation loop in the i965 backend results in 4 loops from The Talos
Principle failing to unroll. Bumping the limit allows them to unroll which
results in the instruction count matching the previous output from when the
GLSL IR opts were still enabled.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-25 09:43:29 +11:00
Timothy Arceri
34ab9b0947 glsl: lower constant arrays to uniform arrays before optimisation loop
Previously the constant array would not get copy propagated until the backend
did its GLSL IR opt loop. I plan on removing that from i965 shortly which
caused huge regressions in Deus-ex and Tomb Raider which have large
constant arrays. Moving lowering before the opt loop in the GLSL linker
fixes this and unexpectedly improves some compute shaders also.

shader-db results BDW:

instructions helped:   shaders/closed/steam/deus-ex-mankind-divided/374.shader_test CS SIMD16: 204 -> 194 (-4.90%)
instructions helped:   shaders/closed/steam/deus-ex-mankind-divided/318.shader_test CS SIMD8: 1010 -> 741 (-26.63%)
instructions helped:   shaders/closed/steam/deus-ex-mankind-divided/144.shader_test CS SIMD8: 542 -> 385 (-28.97%)

cycles helped:   shaders/closed/steam/deus-ex-mankind-divided/318.shader_test CS SIMD8: 1831382 -> 1818492 (-0.70%)
cycles helped:   shaders/closed/steam/deus-ex-mankind-divided/144.shader_test CS SIMD8: 216238 -> 206180 (-4.65%)
cycles helped:   shaders/closed/steam/deus-ex-mankind-divided/374.shader_test CS SIMD16: 18484 -> 16644 (-9.95%)

total instructions in shared programs: 13060313 -> 13059877 (-0.00%)
instructions in affected programs: 1756 -> 1320 (-24.83%)
helped: 3
HURT: 0

total cycles in shared programs: 256586698 -> 256561910 (-0.01%)
cycles in affected programs: 2066104 -> 2041316 (-1.20%)
helped: 3
HURT: 0

V3: only call the opt loop if lowering progressed (Suggested by Eric)

V2: call opts before and after lowering (Suggested by Ken)

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-01-25 09:07:30 +11:00
Ian Romanick
c4a0c1efff mesa: Don't advertise GL_OES_read_format in core profile
OpenGL ES implementations are not allowed to ship ARB extensions, and
OpenGL implementations are not allowed to ship OES extensions.

The functionality is also included in GL_ARB_ES2_compatibility.  Every
OpenGL core-profile driver currently exposes both extensions.  I don't
know of any applications that explicitly check for GL_OES_read_format,
so removing it seems very unlikely to cause problems.  No functionality
is removed.

I have left this extension in place for compatibility profile.  There
are still OpenGL 1.x drivers in Mesa, and adding code to check for
compatibility profile and not GL_ARB_ES2_compatibility for
GL_IMPLEMENTATION_COLOR_READ_TYPE and GL_IMPLEMENTATION_COLOR_READ_FORMAT
just feels dumb.

Three other alternatives considered:

 - Remove the string from compatibility profile drivers but leave the
   functionality in place.

 - Add a flag to expose the extension string, and set it in every OpenGL
   driver that does not expose GL_ARB_ES2_compatibility (and those
   drivers only).  I tried this.  You can't have two instances of an
   extension in the extension table (one dummy_true for ES1 and one with
   a flag for compatibility profile), so the implementation requires a
   bit of effort.

 - Only expose the extension in compatibility if the version is less
   than 2.0.  I didn't see an easy way to do this.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2017-01-24 13:39:26 -08:00
Brian Paul
b87eedd405 docs: fix incorrect link to 12.0.6 release notes
Trivial.
2017-01-24 14:30:44 -07:00
Jason Ekstrand
a435991d3c anv: Expose VK_KHR_maintenance1
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-24 12:27:48 -08:00
Jason Ekstrand
756533520e anv: Return better errors from AllocateDescriptorSets
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-24 12:27:48 -08:00
Jason Ekstrand
99bb4c22a5 anv: Allow selecting the slice of a 3D image
As per VK_KHR_maintenance1, clients can render to a slice of a 3D image
by creating a VK_IMAGE_VIEW_TYPE_2D view of it.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-24 12:27:48 -08:00
Jason Ekstrand
6d79111834 anv: Report FORMAT_FEATURE_TRANSFER_SRC/DST_BIT_KHR
As of VK_KHR_maintenance1, these are supposed to be reported for any
formats on which we support transfer operations.  For us, this is
anything that we can texture from.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-24 12:27:48 -08:00
Jason Ekstrand
8a8630486b anv: Add trivial support for TrimCommandPoolKHR
Our command buffers already efficiently use a global pool so trimming
doesn't really need to do anything.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-24 12:27:48 -08:00
Jason Ekstrand
5edcc96bf6 anv: Set viewport extents correctly when height is negative
As per VK_KHR_maintenance1, setting a negative height in the viewport
can be used to get flipped coordinates.  This is, apparently, very useful
when porting D3D apps to Vulkan.  All we need to do to support this is
to make sure we actually set the min and max correctly.
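
A minimal sketch of the min/max handling, assuming only a y origin and a
possibly negative height; the struct and function names are hypothetical
and the real anv code may compute and clamp the extents differently:

```c
#include <assert.h>

struct example_extents {
    float y_min;
    float y_max;
};

/* With a negative height, y + height lies below y, so the two ends of the
 * range have to be ordered before being used as min and max. */
static struct example_extents example_viewport_y_extents(float y, float height)
{
    struct example_extents e;
    float y_end = y + height;

    e.y_min = (y < y_end) ? y : y_end;
    e.y_max = (y < y_end) ? y_end : y;
    assert(e.y_min <= e.y_max);
    return e;
}
```

For example, a flipped viewport with y = 600 and height = -600 yields the
same extents (0 to 600) as the unflipped y = 0, height = 600.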

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-24 12:27:48 -08:00
Matt Turner
045f38a507 vulkan: Don't install vk_platform.h or vulkan.h.
These files belong to the vulkan loader.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-24 11:27:20 -08:00
Roland Scheidegger
aceae09ef0 glsl: fix compile errors with mingw due to missing PRIx64 definitions
define __STDC_FORMAT_MACROS and include <inttypes.h> (same as
ir_builder_print_visitor.cpp already does).

Otherwise, some mingw builds error out (since
8e7e1ae036 and
bbce1c538d presumably) with:
src/compiler/glsl/ir_print_visitor.cpp:479:40: error: expected ‘)’ before ‘PRIu64’
   case GLSL_TYPE_UINT64:fprintf(f, "%" PRIu64, ir->value.u64[i]); break;

(Note even with that fix I get other format specifier warnings:
src/compiler/glsl/ir_print_visitor.cpp:473:47:
warning: unknown conversion type character ‘a’ in format [-Wformat=]
                fprintf(f, "%a", ir->value.f[i]);
                                               ^
src/compiler/glsl/ir_print_visitor.cpp:473:47:
warning: too many arguments for format [-Wformat-extra-args]
but it still compiles at least)
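
For reference, the pattern looks roughly like this.  The sketch is plain
C rather than the C++ of ir_print_visitor.cpp, and example_format_u64()
is a hypothetical helper; in C the define is harmless, while some older
C++ toolchains only expose the PRI* macros when it is set.

```c
#define __STDC_FORMAT_MACROS 1
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit unsigned value portably.  PRIu64 expands to the right
 * conversion specifier for uint64_t on the target platform. */
static int example_format_u64(char *buf, size_t size, uint64_t value)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%" PRIu64, value);
}
```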

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-01-24 19:12:46 +01:00
Roland Scheidegger
f4df21ed95 gallivm: don't try to use fast rcp for fdiv
The use of fast rcp instruction is disabled, and will always fall back
to use a division instead (1 / x). Hence, if we get a division opcode,
it doesn't make much sense trying to split that into rcp/mul.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-01-24 19:12:46 +01:00
Roland Scheidegger
25208949d7 gallivm: (trivial) fix ddiv cpu implementation
we can't use the cpu implementation of fdiv, as this one uses different
lp_build_context, which causes assertion failure.
Just use default fdiv action (there is no fast rcp for doubles which we
could potentially use anyway).

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-01-24 19:12:46 +01:00
Roland Scheidegger
3b575a955c tgsi: implement ddiv opcode
softpipe (along with llvmpipe) claims to support arb_gpu_shader_fp64,
so we really need to support that opcode.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-01-24 19:12:46 +01:00
Jason Ekstrand
4c180f9633 i965/blorp: Use the correct ISL format for combined depth/stencil
In brw_blorp_copyteximage, we use the format from the render buffer.
This could be a combined depth/stencil format.  In this case, we handle
stencil properly but we give blorp the wrong ISL format.  Specifically,
we would give blorp ISL_FORMAT_R32G32B32A32_FLOAT which is the wrong
size and was causing GPU hangs.

Fixes: GL45-CTS.gtf30.GL3Tests.packed_depth_stencil.packed_depth_stencil_copyteximage

Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-01-24 10:06:07 -08:00
Samuel Pitoiset
0054dded03 st/glsl_to_tgsi: fix compilation warnings since int64 types
state_tracker/st_glsl_to_tgsi.cpp:302:28: warning: ‘glsl_to_tgsi_instruction::tex_type’
	is too small to hold all values of ‘enum glsl_base_type’
    glsl_base_type tex_type:4;

Fixes: 8ce53d4a2f ("glsl: Add basic ARB_gpu_shader_int64 types")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-24 12:45:39 +01:00
Samuel Pitoiset
d90d37db73 gallium/radeon: undef the very specific UPDATE_COUNTER macro
Also, wrap this into a do { ... } while (0). Suggested by Nicolai.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-24 11:17:25 +01:00
Topi Pohjolainen
ba6399df94 i965/blorp: Add also depth and stencil buffers to render cache
v2 (Jason, Curro): Add stencil also even though it is not
                   enabled yet.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-01-24 10:41:58 +02:00
Ben Widawsky
e63ab36d0e gbm: Fix width height getters return type (trivial)
v2: Other way round... to make consistent, make both return types have
the fixed width - uint32_t.

Cc: Daniel Stone <daniel@fooishbar.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2017-01-23 21:43:38 -08:00
Ben Widawsky
bb9ff98b4c gbm: Move getters to match order in header file (trivial)
Other things are out of order, but I need to add a getter so I'm just
fixing those.

This helps people adding to GBM know where the right place to put things
is.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Daniel Stone <daniels@collabora.com>
2017-01-23 21:43:34 -08:00
Emil Velikov
530cd248f5 docs: add news item and link release notes for 12.0.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-24 02:15:30 +00:00
Emil Velikov
9b16bd8b6c docs: use correct year for the 12.0.6 release notes
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 13953f012d)
2017-01-24 02:15:30 +00:00
Emil Velikov
c16e7e0a60 docs: add sha256 checksums for 12.0.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 36e3f2542d)
2017-01-24 02:15:30 +00:00
Emil Velikov
b1137cb9de docs: add release notes for 12.0.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 555885a0bf)
2017-01-24 02:15:30 +00:00
Emil Velikov
9924cdecd9 docs/releasing: remove stray "cd"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-24 02:15:29 +00:00
Ilia Mirkin
b755f2f233 nv50: add support for MUL_ZERO_WINS property
This is simply keyed off the vertex shader, as that's guaranteed to be
present in any pipeline.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-01-23 20:37:14 -05:00
Ilia Mirkin
8c764a2321 nvc0: add support for MUL_ZERO_WINS property
This sets the dnz flag on all the relevant multiplication operations. At
emission time, this will only be supported by nvc0+, so nv50 will need a
different solution.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-01-23 20:37:14 -05:00
Ilia Mirkin
e1346f25bf st/nine: set the MUL_ZERO_WINS flag when supported
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2017-01-23 20:37:10 -05:00
Ilia Mirkin
6e40938fbc gallium: add PIPE_CAP_TGSI_MUL_ZERO_WINS
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2017-01-23 20:36:47 -05:00
Ilia Mirkin
a2b2cd81d1 gallium: add TGSI_PROPERTY_MUL_ZERO_WINS
This will be useful for proper D3D9 emulation, where this behavior is
expected by some shaders.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2017-01-23 20:35:55 -05:00
Marek Olšák
573bf0940a radeonsi: always set the TCL1_ACTION_ENA when invalidating L2
Some CIK-VI docs say this is the default behavior on SI. That doesn't
answer whether it's also the default behavior on CIK-VI.

Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 23:43:38 +01:00
Marek Olšák
5d3dd70cab radeonsi: don't declare LDS in TES
not used since we started using the offchip tess ring

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 23:43:38 +01:00
Marek Olšák
59c5da40ed radeonsi: preload PS inputs only if KILL is used
so that most shaders can get lower VGPR usage thanks to lazy input loading.
I think this is a more accurate constraint that prevents the black transitions
in Witcher 2.

Affected shaders (7758):
Max Waves: 57437 -> 58231 (1.38 %)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 23:43:38 +01:00
Marek Olšák
7b32ae4df5 gallium/radeon: adjust the rule for using the LINEAR_ALIGNED layout
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 23:43:38 +01:00
Marek Olšák
e248390e93 winsys/amdgpu: drop all IBs if at least one was rejected within the context
The corruption is inevitable and hangs are possible too.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 23:43:38 +01:00
Marek Olšák
1840800860 winsys/amdgpu: report a rejected IB as a lost context
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 23:43:38 +01:00
Dave Airlie
dcfcb3047c vulkan: import latest registry for 1.0.39 extensions.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-24 08:13:37 +10:00
Dave Airlie
e38bee34bf vulkan: bump vulkan.h to 1.0.39 version
This introduces a bunch of new extension defines.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-24 08:13:23 +10:00
Grazvydas Ignotas
f65b3641c3 radv: don't resubmit the same cs over and over while tracing
Fixes: 97dfff54 ("radv: Dump command buffer on hang.")
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: <mesa-stable@lists.freedesktop.org>
2017-01-23 22:27:05 +01:00
Samuel Pitoiset
aa2ace8e49 gallium/radeon: add HUD queries for monitoring some hw blocks
It's also possible to monitor them via performance counters but
the hardware can only use two counters simultaneously. It seems
easier to re-use the existing code which reads from MMIO instead
of writing a multi-pass approach.

v2: - add new lines after ':'

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-23 21:19:49 +01:00
Samuel Pitoiset
a704f19247 gallium/radeon: refactor the GRBM counters path
This will allow exposing more queries in order to know which
blocks are busy/idle.

v2: - add new lines after ':'

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-23 21:19:49 +01:00
George Kyriazis
00847e4f14 swr: Align query results allocation
Some query results struct contents are declared as cache line aligned.
Use aligned malloc, and align the whole struct, to be safe.

Fixes crash when compiling with clang.
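
A rough sketch of an aligned allocation, assuming a 64-byte cache line
and C11 aligned_alloc(); swr uses its own allocation helpers, and the
struct and function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define EXAMPLE_CACHE_LINE 64 /* assumed cache line size */

/* Hypothetical result struct whose contents are meant to be line aligned. */
struct example_query_result {
    uint64_t counters[5];
};

/* aligned_alloc() expects the size to be a multiple of the alignment,
 * so round the request up first.  Release the memory with free(). */
static void *example_alloc_aligned(size_t size)
{
    size_t rounded = (size + EXAMPLE_CACHE_LINE - 1)
                     / EXAMPLE_CACHE_LINE * EXAMPLE_CACHE_LINE;
    void *p = aligned_alloc(EXAMPLE_CACHE_LINE, rounded);

    assert(p == NULL || (uintptr_t)p % EXAMPLE_CACHE_LINE == 0);
    return p;
}
```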

CC: <mesa-stable@lists.freedesktop.org>

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-01-23 14:15:54 -06:00
Bruce Cherniak
b829206b07 swr: Prune empty nodes in CalculateProcessorTopology.
CalculateProcessorTopology tries to figure out system topology by
parsing /proc/cpuinfo to determine the number of threads, cores, and
NUMA nodes.  There are some architectures where the "physical id" begins
with 1 rather than 0, which was creating an empty "0" node and causing a
crash in CreateThreadPool.
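
The pruning step can be pictured as a simple in-place compaction.  This
is only a sketch in C with a hypothetical node struct; the swr code is
C++ and its topology data structures differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical NUMA node record: only the core count matters here. */
struct example_node {
    int num_cores;
};

/* Remove nodes without cores, keeping the order of the remaining ones.
 * Returns the number of nodes left at the front of the array. */
static size_t example_prune_empty(struct example_node *nodes, size_t count)
{
    size_t kept = 0;

    assert(nodes != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (nodes[i].num_cores > 0)
            nodes[kept++] = nodes[i];
    }
    return kept;
}
```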

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97102
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
CC: <mesa-stable@lists.freedesktop.org>
2017-01-23 13:52:26 -06:00
Matt Turner
d349449a16 i965: Use UNUSED to silence unused variable (used in assert).
2017-01-23 10:50:20 -08:00
Rainer Hochecker
09b140abb5 dri: allow 16bit R/GR images to be exported via drm buffers
This allows eglCreateImageKHR to access P010 surfaces created by vaapi

Signed-off-by: Rainer Hochecker <fernetmenta@online.de>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
2017-01-23 08:47:15 -08:00
Christian König
1338d912f5 st/va: make sure that we call begin_frame() only once v2
This fixes "st/va: delay calling begin_frame until we have all parameters".

v2: call begin frame after decoder (re)creation as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Tested-by: Andy Furniss <adf.lists@gmail.com>
2017-01-23 17:00:04 +01:00
Eric Engestrom
50141e131a drirc: remove spurious tabs
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 16:34:58 +01:00
Nicolai Hähnle
cfabbbcfd7 st/glsl_to_tgsi: use DDIV instead of DRCP + DMUL
Fixes GL45-CTS.gpu_shader_fp64.built_in_functions.

v2: use DDIV unconditionally (Roland)

Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Tested-by: Glenn Kennard <glenn.kennard@gmail.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
2017-01-23 16:17:26 +01:00
Nicolai Hähnle
b71c415c3d glsl: split DIV_TO_MUL_RCP into single- and double-precision flags
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Glenn Kennard <glenn.kennard@gmail.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
2017-01-23 16:17:19 +01:00
Nicolai Hähnle
e4f8f9a638 r600: implement DDIV
Tested-by: Glenn Kennard <glenn.kennard@gmail.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
2017-01-23 16:17:15 +01:00
Nicolai Hähnle
488560cfe6 r600: factor out cayman_emit_unary_double_raw
We will use it for DDIV.

Tested-by: Glenn Kennard <glenn.kennard@gmail.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
2017-01-23 16:17:12 +01:00
Nicolai Hähnle
76b02d2fe1 r600: double multiply can handle only one multiply at a time
It seems clear that trying to multiply two pairs of doubles would result
in the temporary register getting overwritten by the second pair. So
make the code more explicit.

Tested-by: Glenn Kennard <glenn.kennard@gmail.com>
Tested-by: James Harvey <lothmordor@gmail.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
2017-01-23 16:15:45 +01:00
Timothy Arceri
f3f9207786 glsl: fix tes linking regression
Fixes regression caused by cbeba6bd48. I accidentally pushed the
wrong version of the patch.
2017-01-23 19:07:22 +11:00
Timothy Arceri
38a67f020d mesa: remove unused gl_shader_info field from gl_linked_shader
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 14:48:04 +11:00
Timothy Arceri
79f07e87c9 mesa/glsl: set and get cs layouts to and from shader_info
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 14:48:04 +11:00
Timothy Arceri
b96bddae67 mesa/glsl: set and get gs layouts directly to and from shader_info
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 14:48:04 +11:00
Timothy Arceri
cbeba6bd48 mesa/glsl/i965: set and get tes layouts directly to and from shader_info
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 14:48:04 +11:00
Timothy Arceri
64e201ab8f glsl: use last_vert_prog to get last {clip,cull}_distance_array_size
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 14:48:04 +11:00
Timothy Arceri
fc707f570f mesa/glsl: set {clip,cull}_distance_array_size directly in gl_program
There are some line wrapping violations here but those lines will get
deleted in the following patch.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 14:48:04 +11:00
Timothy Arceri
f86d15ed94 st/mesa/glsl: change xfb_program field to last_vert_prog
Now that the i965 backend doesn't depend on this field we can
make it more generic and short circuit a bunch of code paths.

The new field will be used in a following patch for another
clean-up.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-23 14:48:04 +11:00
Timothy Arceri
c505d6d852 mesa: use gl_program for CurrentProgram rather than gl_shader_program
This makes much more sense and should be more performant in some
critical paths such as SSO validation which is called at draw time.

Previously the CurrentProgram array could have contained multiple
pointers to the same struct which was confusing and we would often
need to fish out the information we were really after from the
gl_program anyway.

Also it was error prone to depend on the _LinkedShader array for
programs in current use because a failed linking attempt will lose
the information about the current program in use which is still
valid.

V2: fix validate_io() to compare linked_stages rather than the
consumer and producer to decide if we are looking at inward
facing shader interfaces which don't need validation.

Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>

To avoid build regressions the following 2 patches were squashed in to
this commit:

mesa/meta: rewrite _mesa_shader_program_use() and _mesa_program_use()

These are rewritten to do what the function name suggests, that is
_mesa_shader_program_use() sets the use of all stages and
_mesa_program_use() sets the use of a single stage.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>

mesa: update active relinked program

This likely fixes a subroutine bug where
_mesa_shader_program_init_subroutine_defaults() would never have been
called for the relinked program as we previously just set
_NEW_PROGRAM as dirty and never called the _mesa_use* functions when
linking.

Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-01-23 14:48:04 +11:00
Rob Clark
31daeb5bf1 freedreno/a5xx: set frag shader threadsize
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-22 14:12:05 -05:00
Rob Clark
8d6af93e76 freedreno/a5xx: set fragcoordxy properly
What a3xx docs call IJPERSPCENTERREGID.. the xy coord passed into
bary.f.  We were incorrectly setting both this and gl_FragCoord.xy to
the same register resulting in all sorts of hilarity.

Fixes stk, vdrift, 0ad, probably a bunch others.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-22 14:11:43 -05:00
Rob Clark
278b97946f freedreno/ir3: setup var locations in standalone compiler
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-01-22 14:11:26 -05:00
Rob Clark
6cc93bedc1 freedreno/a5xx: fix psize
Note spritelist (POINTLIST_PSIZE) seems not to be a thing anymore on
a5xx.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-22 14:11:15 -05:00
Rob Clark
141a4f86d6 freedreno/a5xx: srgb fix
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-22 14:11:04 -05:00
Rob Clark
69fbb458cf freedreno/a5xx: fix int vbos
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-22 14:10:54 -05:00
Rob Clark
16671e9704 freedreno/a5xx: fix clear for uint/sint formats
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-22 14:10:42 -05:00
Rob Clark
4d9aa4f67d freedreno/a5xx: fix cull state
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-22 14:10:28 -05:00
Rob Clark
4c39458460 freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-22 14:09:45 -05:00
Lionel Landwerlin
494b63f525 anv: descriptors: don't update immutables samplers with anything but their immutable value
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-21 19:22:27 +00:00
Jason Ekstrand
bb96b03461 nir/search: Use the correct bit size for integer comparisons
The previous code always compared integers as 64-bit.  Due to variations
in sign-extension in the code generated by nir_opt_algebraic.py, this
meant that nir_search doesn't always do what you want.  Instead, 32-bit
values should be matched as 32-bit and 64-bit values should be matched
as 64-bit.  While we're here we unify the unsigned and signed paths.
Now that we're using the right bit size, they should be the same since
the only difference we had before was sign extension.

This gets the UE4 bitfield_extract optimization working again.  It had
stopped working due to the constant 0xff00ff00 getting sign-extended
when it shouldn't have.
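
A rough illustration of the mismatch, as a Python sketch rather than the
nir_search code itself (all helper names here are hypothetical): the same
32-bit constant compares unequal when one copy was sign-extended to 64 bits
and the other was not, but compares equal once both are matched at 32 bits.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sign_extend_32_to_64(value):
    """Return the 64-bit pattern of a 32-bit value after sign extension."""
    value &= MASK32
    if value & 0x80000000:
        return value | 0xFFFFFFFF00000000
    return value


def zero_extend_32_to_64(value):
    """Return the 64-bit pattern of a 32-bit value after zero extension."""
    return value & MASK32


def match_as_64bit(a, b):
    """Old behaviour in spirit: always compare the full 64-bit patterns."""
    return (a & MASK64) == (b & MASK64)


def match_at_bit_size(a, b, bit_size):
    """New behaviour in spirit: compare only the bits the value really has."""
    mask = (1 << bit_size) - 1
    return (a & mask) == (b & mask)


# One copy of the constant was sign-extended, the other was not.
pattern_const = sign_extend_32_to_64(0xFF00FF00)
shader_const = zero_extend_32_to_64(0xFF00FF00)

print(match_as_64bit(pattern_const, shader_const))
print(match_at_bit_size(pattern_const, shader_const, 32))
```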

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-01-21 10:34:21 -08:00
Jason Ekstrand
817f9e3b17 intel/blorp/copy: Properly handle clear colors for CCS_E images
In order to handle CCS_E, we stomp the image format to a UINT format and
then do some bitcasting logic in the shader.  This works fine since SKL
render compression only considers the channel layout of the format and
not the format itself.  In order for this to work on images that have
been fast-cleared, we need to also convert the clear color so that, when
interpreted as UINT, it provides the same bit value as it would have in
the original format.  This fixes a bunch of OpenGL ES CTS tests for
copy_image when we start using CCS more aggressively.
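
To make the idea concrete, here is a small Python sketch (not the blorp
code; the two helpers and the choice of formats are assumptions made for
illustration). It computes the bit value a clear-color channel would have
in its original format, which is what the UINT view of the surface needs to
see.

```python
import struct


def float32_channel_bits(value):
    """Bits of a 32-bit float channel, reinterpreted as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def unorm8_channel_bits(value):
    """Bits of an 8-bit UNORM channel: clamp to [0, 1], scale to [0, 255]."""
    clamped = min(max(value, 0.0), 1.0)
    return int(clamped * 255.0 + 0.5)


# A clear color of 1.0 has very different UINT representations depending on
# the original channel format.
print(hex(float32_channel_bits(1.0)))
print(unorm8_channel_bits(1.0))
```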

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-21 10:34:09 -08:00
Kenneth Graunke
bb5db5564f glsl: Rename [u]int64_t tokens.
basetsd.h on Windows defines INT64 and UINT64 typedefs which conflict
with these.  Append "_TOK" to avoid conflicts.

Should fix the Windows build.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 19:39:20 -08:00
Matt Turner
892781d6c7 Revert "i965: Really don't emit Q or UQ moves on Gen < 8"
This reverts commit c95380c404.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-20 19:12:31 -08:00
Matt Turner
d871f8e820 i965: Select DF type for 64-bit integers on Gen < 8.
Gen8 adds Q/UQ types. We attempted to change the types back to DF in the
generator (commit c95380c40), but an assertion added in the FP64 series
(commit e481dcc3) triggers before that code has a chance to execute.

In fact, using Q/UQ in the IR and then changing to DF in the generator
would not work in the presence of source modifiers, etc.

Fixes: d6fcede6 ("i965: Return Q and UQ types for int64 and uint64")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-20 19:12:24 -08:00
Ian Romanick
db6d23cfd2 i965: Enable ARB_gpu_shader_int64 on Gen8+
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
fc16bf125f i965: Split SIMD16 CMP of Q and UQ instructions
This is basically the same as happens for doubles.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
51807c6493 i965: Enable 64-bit integer support for almost all unary and binary operations
Integer comparison functions (e.g., nir_op_ilt) are handled in the next
commit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
821d7cece8 i965: Enable uploading 64-bit integer uniforms
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
e0579c5017 i965: Add 64-bit integer support for conversions and bitcasts
v2 (idr): Make the "from" type in a cast unsized.  This reduces the
number of required cast operations at the expense of slightly more
complex code.  However, this will be a dramatic improvement when other
sized integer types are added.  Suggested by Connor.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
f2fa510594 i965: Enable emitting Q and UQ instructions in the fs backend
v2: Fixup assertion in brw_reg_type_to_hw_type to allow
BRW_REGISTER_TYPE_{UQ,Q} on Gen8+.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
409e0b2d48 i965: Add support for constant evaluation on Q and UQ types
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
d6fcede60f i965: Return Q and UQ types for int64 and uint64
It seems like maybe this should return a different type based on Gen.  Q
and UQ only exist on Gen8+, but, based on the old comment, I believe
previous Gens can generate 64-bit moves.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
c95380c404 i965: Really don't emit Q or UQ moves on Gen < 8
It's much easier to do this in the generator rather than while coming
out of NIR.  brw_type_for_nir_type doesn't know the Gen, so we'd have to
add a bunch of plumbing.  The alternate fix is to not emit int64 moves
for doubles in the first place... but that seems even more difficult.

This change won't catch non-MOV instructions that try to use 64-bit
integer types on Gen < 8.  This may convert certain kinds of bugs into
different kinds of bugs that are more difficult to detect (since the
assertions in the function won't catch them).

NOTE: I don't think anything can emit mixed-type 64-bit moves until the
same platform supports both ARB_gpu_shader_fp64 and
ARB_gpu_shader_int64.  When we enable int64 on Gen < 8, we can solve
this problem other ways.

This prevents regressions on HSW in the next patch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
30164d501d nir: Add support for 64-bit integer types to split_var_copies_block
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
3c9b35372b nir: Enable 64-bit integer support for almost all unary and binary operations
v2: Don't up-convert the shift count parameter of shift instructions.
Suggested by Connor.  Add type_is_signed() function.  This will make
adding 8- and 16-bit types easier.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
2017-01-20 15:41:23 -08:00
Ian Romanick
fda33e09d8 nir: Shift count for shift opcodes is always 32-bits
Previously both sources were unsized.  This caused problems when the
thing being shifted was 64-bit but the shift count was 32-bit.  The
expectation in NIR is that all unsized sources (and destination) will
ultimately have the same size.

The changes in nir_opt_algebraic.py are to prevent errors like:

 Failed to parse transformation:
03:12:25   (('extract_i8', 'a', 'b'), ('ishr', ('ishl', 'a', ('imul', ('isub', 3, 'b'), 8)), 24), 'options->lower_extract_byte')
03:12:25 Traceback (most recent call last):
03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 610, in __init__
03:12:25     xform = SearchAndReplace(xform)
03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 495, in __init__
03:12:25     BitSizeValidator(varset).validate(self.search, self.replace)
03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 311, in validate
03:12:25     validate_dst_class = self._validate_bit_class_up(replace)
03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 414, in _validate_bit_class_up
03:12:25     src_class = self._validate_bit_class_up(val.sources[i])
03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 420, in _validate_bit_class_up
03:12:25     assert src_class == src_type_bits
03:12:25 AssertionError
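
For reference, the replacement expression in the transformation quoted
above can be evaluated by hand.  The Python sketch below (helper names are
hypothetical, and 'a' is assumed to be 32 bits wide) mirrors
ishr(ishl(a, (3 - b) * 8), 24); note that both shift counts are small
32-bit values that do not depend on the size of the thing being shifted.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Interpret the low 32 bits of value as a two's complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def ishl32(value, count):
    """32-bit left shift; the count is an ordinary small integer."""
    return (value << (count & 31)) & MASK32


def ishr32(value, count):
    """32-bit arithmetic right shift, returned as a signed integer."""
    return to_signed32(value) >> (count & 31)


def extract_i8_lowered(a, b):
    """Sign-extended byte b of a, computed with two shifts."""
    shift = (3 - b) * 8
    return ishr32(ishl32(a, shift), 24)


word = 0x1280FF7F
print([extract_i8_lowered(word, b) for b in range(4)])
```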

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
2017-01-20 15:41:23 -08:00
Ian Romanick
8ad74a2745 nir: Lower packing and unpacking of 64-bit integer types
This change makes me wonder whether double packing should be
reimplemented as int64BitsToDouble(packInt2x32(v)).  I'm a little on the
fence since not all platforms that support fp64 natively support int64.
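
As a reminder of what the packing operations compute, here is a Python
sketch (the function names are made up for this note; they follow the GLSL
convention that the first component holds the 32 least significant bits):

```python
MASK32 = 0xFFFFFFFF


def pack_uint_2x32(lo, hi):
    """Combine two 32-bit words into one 64-bit value, low word first."""
    return ((hi & MASK32) << 32) | (lo & MASK32)


def unpack_uint_2x32(value):
    """Split a 64-bit value into its (low, high) 32-bit words."""
    return value & MASK32, (value >> 32) & MASK32


packed = pack_uint_2x32(0xDEADBEEF, 0x00000001)
print(hex(packed))
print([hex(word) for word in unpack_uint_2x32(packed)])
```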

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
3460d05a71 nir: Add 64-bit integer support for conversions and bitcasts
v2 (idr): "cut them down later" => Remove ir_unop_b2u64 and
ir_unop_u642b.  Handle these with extra i2u or u2i casts just like
uint(bool) and bool(uint) conversion is done.

v3 (idr): Make the "from" type in a cast unsized.  This reduces the
number of required cast operations at the expense of slightly more
complex code.  However, this will be a dramatic improvement when other
sized integer types are added.  Suggested by Connor.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
3ca0029a0d nir: Add 64-bit integer constant support
v2: Rebase on 19a541f (nir: Get rid of nir_constant_data)

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v1]
2017-01-20 15:41:23 -08:00
Ian Romanick
48e122244b nir: Add GLSL_TYPE_INT64 and GLSL_TYPE_UINT64 to glsl_get_bit_size
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
81952814a3 glsl: Optimize redundant pack(unpack()) and unpack(pack()) combinations
The lowering passes for 64-bit integer operations will generate a lot of
these.
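
The shape of the optimization, sketched in Python on a toy expression tree
(this is not the GLSL IR; the tuple encoding and the operation names are
assumptions made for illustration): a pack applied directly to the matching
unpack, or the reverse, collapses to the inner operand.

```python
# Each operation maps to the operation that undoes it.
INVERSE = {
    "pack_int_2x32": "unpack_int_2x32",
    "unpack_int_2x32": "pack_int_2x32",
    "pack_uint_2x32": "unpack_uint_2x32",
    "unpack_uint_2x32": "pack_uint_2x32",
}


def simplify(expr):
    """Remove op(inverse_op(x)) pairs from a tuple-encoded expression."""
    if not isinstance(expr, tuple):
        return expr  # a leaf such as a variable name
    op = expr[0]
    args = [simplify(arg) for arg in expr[1:]]
    if len(args) == 1:
        inner = args[0]
        if (isinstance(inner, tuple) and len(inner) == 2
                and INVERSE.get(op) == inner[0]):
            return inner[1]
    return (op, *args)


tree = ("add", ("pack_int_2x32", ("unpack_int_2x32", "x")), "y")
print(simplify(tree))
```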

v2: Modify the HANDLE_PACK_UNPACK_INVERSE so that the breaks apply to
the switch instead of the 'do { } while(true)' loop.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
7122d851aa glsl: Add a lowering pass for 64-bit integer modulus
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
695b04f7eb glsl: Add "built-in" functions to do 64%64 => 64 modulus
These functions are directly available in shaders.  A #define is added
to detect the presence.  This allows these functions to be tested using
piglit regardless of whether the driver uses them for lowering.  The
GLSL spec says that functions and macros beginning with __ are reserved
for use by the implementation... hey, that's us!

v2: Use function inlining.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
82c31f3eb9 glsl: Add a lowering pass for 64-bit integer division
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
012f2995c3 glsl: Add "built-in" functions to do 64/64 => 64 division
These functions are directly available in shaders.  A #define is added
to detect the presence.  This allows these functions to be tested using
piglit regardless of whether the driver uses them for lowering.  The
GLSL spec says that functions and macros beginning with __ are reserved
for use by the implementation... hey, that's us!
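
For readers unfamiliar with how a 64-bit divide can be built from simpler
operations, here is a generic shift-and-subtract sketch in Python.  It is
only an illustration of the technique; it is not claimed to be the
algorithm the built-in uses, and the function name is hypothetical.

```python
def udivmod64(numerator, denominator):
    """Unsigned 64-bit division by restoring shift-and-subtract.

    Returns (quotient, remainder).  Division by zero raises here; a shader
    implementation would leave the result undefined instead.
    """
    if denominator == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for bit in range(63, -1, -1):
        # Bring down the next numerator bit, most significant first.
        remainder = (remainder << 1) | ((numerator >> bit) & 1)
        if remainder >= denominator:
            remainder -= denominator
            quotient |= 1 << bit
    return quotient, remainder


print(udivmod64(1000000000000, 7))
```

The same loop yields the remainder, so a modulus built-in can share the
approach.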

v2: Use function inlining.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
50d52df278 glsl: Add a lowering pass for 64-bit integer sign()
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
6b03b345eb glsl: Add "built-in" function for 64-bit integer sign()
These functions are directly available in shaders.  A #define is added
to detect the presence.  This allows these functions to be tested using
piglit regardless of whether the driver uses them for lowering.  The
GLSL spec says that functions and macros beginning with __ are reserved
for use by the implementation... hey, that's us!
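
One way sign() can be computed on a 64-bit value stored as two 32-bit
words, sketched in Python (a guess at the general technique, not the
built-in's actual source; the function name is hypothetical):

```python
MASK32 = 0xFFFFFFFF


def sign64_lowered(lo, hi):
    """Return sign(x) as (lo, hi) words for x given as (lo, hi) words."""
    # Equivalent to an arithmetic shift of the high word by 31:
    # all ones for negative inputs, zero otherwise.
    hi_sign = MASK32 if hi & 0x80000000 else 0
    # 1 if any bit of the input is set, 0 for an input of zero.
    nonzero = 1 if (lo | hi) & MASK32 else 0
    return (hi_sign | nonzero) & MASK32, hi_sign


print(sign64_lowered(0x00000005, 0x00000000))  # positive input
print(sign64_lowered(0xFFFFFFFB, 0xFFFFFFFF))  # negative input
print(sign64_lowered(0x00000000, 0x00000000))  # zero
```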

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
6c3af04363 glsl: Add a lowering pass for 64-bit integer multiplication
v2: Rename lower_64bit.cpp and lower_64bit_test.cpp to lower_int64.
Suggested by Matt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
330fc2413c glsl: Add "built-in" functions to do 64x64 => 64 multiplication
These functions are directly available in shaders.  A #define is added
to detect the presence.  This allows these functions to be tested using
piglit regardless of whether the driver uses them for lowering.  The
GLSL spec says that functions and macros beginning with __ are reserved
for use by the implementation... hey, that's us!
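
The arithmetic behind a 64x64 => 64 multiply on 32-bit words, as a Python
sketch (illustrative only; the function names are hypothetical and this is
not the built-in's source).  Only the low 64 bits of the product are kept,
so the a_hi * b_hi term drops out entirely.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def umul_32x32(a, b):
    """Return the (low, high) 32-bit words of a 32x32 => 64 product."""
    product = (a & MASK32) * (b & MASK32)
    return product & MASK32, product >> 32


def mul64_lowered(a_lo, a_hi, b_lo, b_hi):
    """Low 64 bits of a * b, with operands and result as (lo, hi) words."""
    lo, carry = umul_32x32(a_lo, b_lo)
    # The cross terms only contribute to the high word.
    hi = (carry + a_lo * b_hi + a_hi * b_lo) & MASK32
    return lo, hi


def split64(value):
    """Helper for the demo: split a 64-bit value into (lo, hi) words."""
    return value & MASK32, (value >> 32) & MASK32


a, b = 0x123456789ABCDEF0, 0x0FEDCBA987654321
lo, hi = mul64_lowered(*split64(a), *split64(b))
print(hex((hi << 32) | lo), hex((a * b) & MASK64))
```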

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
aa38bf1e59 glsl: Move builtin_function related prototypes to a separate file
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
8358e58f25 glsl/standalone: Enable ARB_gpu_shader_int64
v2: Add missing break in GLSL_TYPE_INT64 case.  Noticed by Matt.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
8dfea5348c i965: Avoid int64 warnings.
Just add operations to the switch statement here.

v2 (idr): "cut them down later" => Remove ir_unop_b2u64 and
ir_unop_u642b.  Handle these with extra i2u or u2i casts just like
uint(bool) and bool(uint) conversion is done.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
c101cee2ba i965: Avoid int64 induced warnings
Just add types into unsupported or double equivalent spots.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
a53f315ad8 mesa/program: Add unused ir operations.
v2 (idr): "cut them down later" => Remove ir_unop_b2u64 and
ir_unop_u642b.  Handle these with extra i2u or u2i casts just like
uint(bool) and bool(uint) conversion is done.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
f82ced5af3 glsl: Allow GLSL_TYPE_INT64 for ir_unop_abs and ir_unop_sign
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
8e7e1ae036 glsl: Print GLSL_TYPE_UINT64 and GLSL_TYPE_INT64 values
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Ian Romanick
0d14fec345 glsl: Add interaction between ARB_gpu_shader_int64 and ARB_shader_clock
If ARB_gpu_shader_int64 is supported, ARB_shader_clock also adds
clockARB() that returns a uint64_t.  Rather than add new opcodes and
intrinsics for this, just wrap the existing intrinsic with a
packUint2x32.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
bfc4080d38 glsl: Add 64-bit integer functions
These are all the allowed 64-bit functions from ARB_gpu_shader_int64
spec.

v2: restrict int64/double functions better.

v3 (idr): Delete spurious blank lines.  Suggested by Matt.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
050f38ef0b glsl/varying_packing: Add 64-bit integer support
As for the double code, but using the 64-bit integer conversions.

v2 (idr): Remove some spurious u2i() and i2u() operations when packing
and unpacking, respectively, int64_t varyings.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
923aebdd46 glsl/ast: Add 64-bit integer support in some places.
Just add support in two more places in ast parsing.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
9ba9a7f854 glsl: Add 64-bit integer support to some operations.
This adds 64-bit integer support to some AST and IR operations where
it is needed.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
25c7a61b28 glsl/ir_builder: Add support for some 64-bit bitcasts.
We need builder support to implement some of the builtins.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
78cc44280e glsl/ast: Add 64-bit integer support to conversion functions
This adds support to call the new operations on conversions.

v2 (idr): Delete an unnecessary break-statement.  Noticed by Matt.  Add
a missing blank line.  Noticed by Ian.

v3 (idr): "cut them down later" => Remove ir_unop_b2u64 and
ir_unop_u642b.  Handle these with extra i2u or u2i casts just like
uint(bool) and bool(uint) conversion is done.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com> [v2]
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
85faf5082f glsl: Add 64-bit integer support for constant expressions
This just adds the new operations and add 64-bit integer support to all
the existing cases where it is needed.

v2: fix some issues found in testing.
v2.1: add unreachable (Ian), add missing int/uint pack/unpack (Dave).

v3 (idr): Rebase on top of idr's series to generate
ir_expression_operation_constant.h. In addition, this version:

    Adds missing support for ir_unop_bit_not, ir_binop_all_equal,
    ir_binop_any_nequal, ir_binop_vector_extract,
    ir_triop_vector_insert, and ir_quadop_vector.

    Removes support for uint64_t from ir_unop_abs and ir_unop_sign.

v4 (idr): "cut them down later" => Remove ir_unop_b2u64 and
ir_unop_u642b.  Handle these with extra i2u or u2i casts just like
uint(bool) and bool(uint) conversion is done.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v2]
Reviewed-by: Matt Turner <mattst88@gmail.com> [v3]
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
a68b6ee063 glsl/ir: Add support for 64-bit integer conversions.
This adds all the conversions in the world.  I'm not 100% sure if all of
these are needed, but add all of them and we can cut them down later.

v2: fix issue with packing output types.

v3 (idr): Rebase on top of idr's series to generate
ir_expression_operation_constant.h.  Fix transposed ir_validate
assertions for ir_unop_u642i64 and ir_unop_i642u64.  Add missing
automatic type setup for ir_unop_u642i64 and ir_unop_i642u64.

v4 (idr): "cut them down later" => Remove ir_unop_b2u64 and
ir_unop_u642b.  Handle these with extra i2u or u2i casts just like
uint(bool) and bool(uint) conversion is done.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v2]
Reviewed-by: Matt Turner <mattst88@gmail.com> [v3]
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
7dd63c10c3 glsl: Add 64-bit integer support to uniform initialiser code
Just add support to the double case, same code should work.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
8df5287c23 glsl/varyings: Add 64-bit integer support.
This adds 64-bit ints to the link_varyings 64-bit support.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
bbce1c538d glsl/ast/ir: Add 64-bit integer constant support
This adds support for 64-bit integer constants to the parser,
ast and ir.

v2: fix a few issues found in testing.

v3: Add missing ir_constant copy constructor support.

v4: Use PRIu64 and PRId64 in printfs in glsl_parser_extras.cpp.
Suggested by Nicolai.  Rebase on Marek's linalloc changes.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v2]
Reviewed-by: Matt Turner <mattst88@gmail.com> [v3]
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
249007d13c mesa: Add support for 64-bit integer uniforms
This hooks up the API to the internals for 64-bit integer uniforms.

v2: update to use non-strict aliased alternatives

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
8ce53d4a2f glsl: Add basic ARB_gpu_shader_int64 types
This adds the builtins and the lexer support.

To avoid too many warnings, it adds basic support to the type in a few
other places in mesa, mostly in the trivial places.

It also adds a query, to be used later, for whether a type is a 32-bit or 64-bit integer.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
e90830bb8e glsl: Add ARB_gpu_shader_int64 boilerplate.
This just adds the basic boilerplate support.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
839ce21143 mesa: Add ARB_gpu_shader_int64 extension bits
This just adds the usual boilerplate in mesa core.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Dave Airlie
150f2fa789 mapi: Add support for ARB_gpu_shader_int64.
Just add the boilerplate xml code.

v2 (idr): Update dispatch_sanity.  Only add extension functions in core
profile.

v3 (idr): Remove comment line from gl_API.xml.  Suggested by Matt.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-01-20 15:41:23 -08:00
Lionel Landwerlin
74c23bde5b anv: don't require render target isl bit for depth/stencil surfaces
Blorp can deal with depth/stencil surfaces blits/copies without the
render target requirement. Also having both render target and
depth/stencil requirement is incompatible from isl's point of view.

This fixes an image creation issue in the high level quality settings
of the Unity3D player, which requires a depth texture with src/dst
transfer & 4x multisampling.

v2: Simplify aspect checking condition (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
2017-01-20 21:39:51 +00:00
Lionel Landwerlin
8a28e764d0 spirv: don't assert with location decorations on non i/o variables
Some applications might add location decoration to samplers. Rather
than raising an error it seems it would make more sense to just
discard these decorations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
2017-01-20 21:39:46 +00:00
Matt Turner
f57bdd4849 i965: Validate "Special Cases for Byte Operations"
Do this in general_restrictions_based_on_operand_types() because the two
rules that "Special Cases for Byte Operations" relax are checked there.
2017-01-20 11:40:52 -08:00
Matt Turner
75b7f5a269 i965: Validate "Region Alignment Rules" 2017-01-20 11:40:52 -08:00
Matt Turner
f817d132c1 i965: Validate "General Restrictions Based on Operand Types" 2017-01-20 11:40:52 -08:00
Matt Turner
83696b2234 i965: Validate "General Restrictions on Regioning Parameters" 2017-01-20 11:40:52 -08:00
Matt Turner
df0b7bcdfd i965: Replace reg_type_size[] with a function.
A function is necessary to handle immediate types.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-20 11:40:52 -08:00
Matt Turner
ada891d472 i965: Validate math instruction sources.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-20 11:40:52 -08:00
Matt Turner
fce0612fc2 i965: Claim that SEND/math has two sources.
src1 must be a descriptor (including the information to determine that
the SEND is doing an extended math operation), but src0 can actually be
null since it serves as the source of the implicit GRF -> MRF move.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-20 11:40:52 -08:00
Matt Turner
c9724682b5 i965: Simplify num_sources_from_inst().
desc will always be non-NULL, because brw_validate_instructions() does
not attempt to validate any instructions that fail the
is_unsupported_inst() check.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-20 11:40:52 -08:00
Matt Turner
9fd12666d0 i965: Factor out send_restrictions() function.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-20 11:40:52 -08:00
Matt Turner
7abc65dd7c i965: Factor out sources_not_null() validation function.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-20 11:40:52 -08:00
Matt Turner
a693305b61 i965: Structure code so unsupported inst will not generate more errors.
We want to rely on brw_opcode_desc() always returning non-NULL in other
validation functions. Other validation functions will be in the else
case of the block added in this patch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-20 11:40:52 -08:00
Matt Turner
f0429359cc i965: Add a test for the EU assembly validator.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-20 11:40:52 -08:00
Matt Turner
ae9c69e1cf i965: Add a CHECK macro to call more complicated validation funcs.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-20 11:40:52 -08:00
Matt Turner
25448e4b7e i965: Make ERROR_IF usable from other functions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-20 11:40:52 -08:00
Matt Turner
f9a4fc9b15 i965: Mark error annotation on correct SIMD16 inst.
inst, whose assignment can be seen in the last line of context, pointed
to the correct instruction in the SIMD16 program, but src_offset was the
offset from the beginning of the SIMD16 program.

So if an instruction at offset 0x100 in the SIMD16 program was illegal,
we would mark an error on the instruction at offset 0x100 (which is
likely in the SIMD8 program).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-20 11:40:52 -08:00
Matt Turner
59003f3447 i965/vec4: Use UW-typed operands when dest is UW.
Using a UD-typed operand makes the execution size D, and if the size of
the execution type is greater than the size of the destination type, the
destination must be appropriately strided.

We actually just want UW-types all around.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-20 11:40:52 -08:00
Matt Turner
68bcbfa9e4 i965: Use W-typed immediate in brw_F32TO16().
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-20 11:40:52 -08:00
Matt Turner
3eada948a0 gtest: Update to 1.8.0.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-20 11:40:52 -08:00
Matt Turner
cbc39e541f i965: Don't change F->VF if dest type is DF.
We change the immediate source type to VF to allow instruction
compaction, but there are no entries in the compaction table for DF, so
there's no point in doing this.

Additionally, mixing floating-point types is not allowed except for
F and VF.
2017-01-20 11:40:52 -08:00
Lionel Landwerlin
a72dea9483 anv: fix comment typo
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-20 16:46:32 +00:00
Lionel Landwerlin
0c3d058723 spirv: fix warn string typo
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-20 16:46:29 +00:00
Lionel Landwerlin
bac6fe5c77 blorp: remove unnecessary struct declaration
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-20 16:46:21 +00:00
Marek Olšák
74f40d1570 Revert "radeonsi: reject invalid vertex element formats"
This reverts commit 9e4d1d8a7c.

It broke arb_vertex_type_10f_11f_11f_rev-draw-vertices, which has
first_non_void == -1.
2017-01-20 16:02:45 +01:00
Philipp Zabel
a37cf630b4 gallium: add pipe_screen::resource_changed callback wrappers
Add resource_changed to the ddebug, rbug, and trace wrappers. Since it
is optional, there is no need to add it to noop.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Suggested-by: Nicolai Hähnle <nhaehnle@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-01-20 15:30:30 +01:00
Philipp Zabel
97de7e6586 st/mesa: ask pipe driver to recreate derived internal resources when (re-)binding external textures
Use the resource_changed callback to invalidate internal resources
derived from external textures when they are (re-)bound. This is needed
to comply with the requirement from the GL_OES_EGL_image_external
extension that a call to glBindTexture guarantees that all further
sampling will return values that correspond to the values in the
external texture at or after the time that glBindTexture was called.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-01-20 15:30:30 +01:00
Philipp Zabel
9bab714c61 mesa: update external textures when (re-)binding
To comply with the requirement from the GL_OES_EGL_image_external
extension that a call to glBindTexture guarantees that all further
sampling will return values that correspond to the values in the
external texture at or after the time that glBindTexture was called,
do not bail out early from mesa_BindTextures if the target is
external.
This will later allow the state tracker to instruct the pipe driver
to invalidate internal resources derived from the external texture.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-01-20 15:30:30 +01:00
Philipp Zabel
c70ed79e79 etnaviv: implement resource_changed to invalidate internal resources derived from imported buffers
Implement the resource_changed pipe callback to invalidate internal
resources derived from imported buffers. This is needed to update the
texture for re-imported renderables.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-01-20 15:30:30 +01:00
Philipp Zabel
362edc868c etnaviv: initialize seqno of imported resources
Imported resources already have contents that we want to be copied to
texture resources derived from them. Set initial seqno of imported
resources to 1, just as if it had already been rendered to.
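
A minimal sketch of why a non-zero initial seqno matters, assuming the driver decides whether to refresh a derived texture by comparing sequence numbers (the message implies this but does not spell it out). The names Resource, import_resource and update_texture are hypothetical and are not the etnaviv API.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    # 0 models a resource that has never been rendered to.
    seqno: int = 0


def import_resource():
    # Assumption: an imported buffer already holds pixels, so it starts
    # at seqno 1, as if it had been rendered to once.
    return Resource(seqno=1)


def update_texture(texture, source):
    """Copy source into texture only if the texture is older.

    Returns True when a copy was (notionally) performed.
    """
    if texture.seqno < source.seqno:
        texture.seqno = source.seqno  # texture now matches the source
        return True
    return False


imported = import_resource()
derived = Resource()
print(update_texture(derived, imported))      # first use: copy happens
print(update_texture(derived, imported))      # already up to date
print(update_texture(Resource(), Resource())) # both at seqno 0: no copy
```

With an initial seqno of 0 the imported buffer would look no newer than a freshly created texture, so the existing contents would never be copied.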

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-01-20 15:30:29 +01:00
Philipp Zabel
2c95d6dac3 st/dri: ask the driver to update its internal copies on reimport
For imported buffers that can't be used directly as a source to the
texture samplers, the pipe driver might need to create an internal
copy, for example in a different tiling layout. When buffers are
reimported they may contain new image data, so the driver internal
copies need to be recreated.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-01-20 15:30:29 +01:00
Philipp Zabel
30853f55a3 gallium: add pipe_screen::resource_changed
Add a hook to tell drivers that an imported resource may have changed
and they need to update their internal derived resources.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-01-20 15:30:29 +01:00
Emil Velikov
5872850b88 configure.ac: move require_dri_shared_libs_and_glapi() before its users
Otherwise we'll get a lovely message as below:
"require_dri_shared_libs_and_glapi: command not found"

Cc: Steven Newbury <steve@snewbury.org.uk>
Reported-by: Steven Newbury <steve@snewbury.org.uk>
Fixes: da410e6afa "configure: explicitly require shared glapi for
enable-dri"
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Steven Newbury <steve@snewbury.org.uk>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-20 14:27:08 +00:00
Samuel Pitoiset
383fc8e9f3 gallium/hud: add missing break in hud_cpufreq_graph_install()
Fixes: e99b9395be "gallium/hud: Add support for CPU frequency monitoring"
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-01-20 10:33:47 +01:00
Tapani Pälli
4148881513 android: correct typo in build
Fixes: 63c58dfc65
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-20 07:49:10 +02:00
Elie TOURNIER
9fdaeb7776 nir: add min/max optimisation
Add the following optimisations:

min(x, -x) = -abs(x)
min(x, -abs(x)) = -abs(x)
min(x, abs(x)) = x
max(x, -abs(x)) = x
max(x, abs(x)) = abs(x)
max(x, -x) = abs(x)
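
The identities above can be spot-checked numerically. The following is a rough sketch in plain Python, not the nir_opt_algebraic rule syntax; the sample values are arbitrary and NaN is deliberately left out, since comparisons with NaN never hold.

```python
import math

# (description, left-hand side, right-hand side) for each rewrite listed above.
IDENTITIES = [
    ("min(x, -x) = -abs(x)", lambda x: min(x, -x), lambda x: -abs(x)),
    ("min(x, -abs(x)) = -abs(x)", lambda x: min(x, -abs(x)), lambda x: -abs(x)),
    ("min(x, abs(x)) = x", lambda x: min(x, abs(x)), lambda x: x),
    ("max(x, -abs(x)) = x", lambda x: max(x, -abs(x)), lambda x: x),
    ("max(x, abs(x)) = abs(x)", lambda x: max(x, abs(x)), lambda x: abs(x)),
    ("max(x, -x) = abs(x)", lambda x: max(x, -x), lambda x: abs(x)),
]

# Arbitrary sample points, including signed zeros and infinities.
SAMPLES = [-3.5, -1.0, -0.0, 0.0, 0.25, 2.0, 1e30, -1e30, math.inf, -math.inf]


def failing_identities(samples=SAMPLES):
    """Return (description, x) pairs where the two sides disagree."""
    failures = []
    for name, lhs, rhs in IDENTITIES:
        for x in samples:
            if lhs(x) != rhs(x):
                failures.append((name, x))
    return failures


print(failing_identities())
```

An empty list means every identity held on every sample; this is a sanity check, not a proof.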

shader-db:

total instructions in shared programs: 13067779 -> 13067775 (-0.00%)
instructions in affected programs: 249 -> 245 (-1.61%)
helped: 4
HURT: 0

total cycles in shared programs: 252054838 -> 252054806 (-0.00%)
cycles in affected programs: 504 -> 472 (-6.35%)
helped: 2
HURT: 0

Signed-off-by: Elie Tournier <tournier.elie@gmail.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-19 21:44:28 -08:00
Jason Ekstrand
f22ee14644 nir/algebraic: Only include nir_search_helpers once
We were including it once per value, so probably around 10k times.
Let's not cause the compiler any more work than we have to.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-01-19 21:40:30 -08:00
Anuj Phogat
6de293284b i965: Remove unnecessary mt->compressed checks
It's harmless to use ALIGN_NPOT() for uncompressed formats
because they have block width/height = 1.
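
To illustrate the claim, here is a small sketch of a round-up-to-multiple helper. It assumes ALIGN_NPOT rounds a value up to the next multiple of an arbitrary (not necessarily power-of-two) alignment; align_npot below is a Python stand-in, not the Mesa macro itself.

```python
def align_npot(value, alignment):
    """Round value up to the next multiple of alignment (any positive int)."""
    return (value + alignment - 1) // alignment * alignment


# Uncompressed formats: block width/height of 1, so alignment changes nothing.
print([align_npot(v, 1) for v in (0, 1, 13, 640)])

# Compressed formats: a 4-wide block rounds 13 up to 16,
# and a non-power-of-two block of 12 rounds 13 up to 24.
print(align_npot(13, 4), align_npot(13, 12))
```

Because aligning to 1 is the identity, the compressed and uncompressed paths can share the same call.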

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-01-19 14:28:18 -08:00
Anuj Phogat
c7e37a0cb8 i965: Fix indentation in brw_miptree_layout_2d()
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-01-19 14:28:18 -08:00
Anuj Phogat
47d9b3a9dd i965: Fix comment to include 3d textures
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-01-19 14:28:18 -08:00
Chad Versace
de0b0a3a9c i965: Delete pending CCS and HiZ ops in intel_miptree_make_shareable()
Fixes crash in piglit
`egl_khr_gl_renderbuffer_image-clear-shared-image GL_DEPTH_COMPONENT24`
on Skylake.

The crash happened because blorp attempted to execute a pending hiz
clear after the hiz buffer was deleted. Deleting the pending hiz ops
when the hiz buffer gets deleted fixes the crash.

For good measure, this patch also deletes all pending CCS/MCS ops when
the CCS/MCS buffer gets deleted. I'm not aware of any bugs
caused by the dangling ops, but deleting them is clearly the right thing
to do.

Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99265
2017-01-19 13:47:57 -08:00
Andres Rodriguez
e0674e740b vulkan/wsi: clarify the severity of lack of DRI3 v2
The current message sounds like a small warning; clarify that it can
result in lack of presentation support and application crashes.

v2: add "if they do" (Bas)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98263
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-19 15:41:42 +00:00
Andres Rodriguez
a3ad6a34c6 radv: fix include order for installed headers v2
In situations where libdrm_amdgpu and mesa are installed to the same
location, the mesa installed headers will take precedence over the git
source headers.

This is due to the AMDGPU_CFLAGS containing the install directory.

This situation can cause build errors if the git version of a header is
newer than the currently installed version of a header (e.g. git pull
updates vulkan.h)

Note: using the same install prefix for mesa and libdrm is probably a
common occurrence since it is described in the radeonBuildHowTo wiki:
https://www.x.org/wiki/radeonBuildHowTo/

v2: added sign-off

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-19 15:41:38 +00:00
Emil Velikov
0f8afde7ba docs/releasing: document post branch version bump
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-19 15:38:30 +00:00
Emil Velikov
49e4204b12 mesa: Bump version to 17.1.0-devel
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-19 15:38:30 +00:00
Marek Olšák
9e4d1d8a7c radeonsi: reject invalid vertex element formats
This should fix a coverity defect.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-01-19 16:38:37 +01:00
Marek Olšák
e490b7812c radeonsi: don't forget to add HTILE to the buffer list for texturing
This fixes VM faults. Discovered by Samuel Pitoiset.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98975
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99450

Cc: 17.0 13.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-01-19 16:38:37 +01:00
Nayan Deshmukh
31908d6a4a st/vdpau: only send buffers with B8G8R8A8 format to X
PresentPixmap only works if the pixmap depth matches with the
window depth, otherwise it returns a BadMatch protocol error.
Even if the depths match, the result won't look correct
if the VDPAU RGB component order doesn't match the X11 one, so
we only allow the X11 format.
Other buffers are copied to a buffer which is sent to X.

v2: only send buffers with format VDP_RGBA_FORMAT_B8G8R8A8
v3: reword commit message
v4: add comment explaining the code

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-01-19 15:34:02 +01:00
Nicolai Hähnle
3cd092c415 radeonsi: fix texture gather on stencil textures
At least on VI, texture gather doesn't work with a 24_8 data format, so
use 8_8_8_8 and a modified swizzle instead.

A bit of background: When creating a GL_STENCIL_INDEX8 texture, we select
the X24S8 pipe format because we don't support stencil-only render targets
properly. With mip-mapping this can lead to a setup where the tiling is
incompatible with stencil texturing, and a flushed stencil texture is
used. For the flushed stencil, a literal X24S8 is used because there were
issues with an 8bpp DB->CB copy.

Longer term, it would be good if we could get away from these workarounds,
i.e. properly support an S8 format for stencil-only rendering and flushed
stencil. Since stencil texturing is somewhat rare, it's not a high
priority.

Fixes GL45-CTS.texture_cube_map_array.sampling.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-01-19 15:02:57 +01:00
Alejandro Piñeiro
905961452a mesa/main: Fix FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE for NONE attachment type
When the attachment type is NONE (att->Type),
FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE should be NONE always.

Note that technically, the current behaviour follows the spec. From
OpenGL 4.5 spec, Section 9.2.3 "Framebuffer Object Queries":

   "If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is NONE, then
    either no framebuffer is bound to target; or the default
    framebuffer is bound, attachment is DEPTH or STENCIL, and the
    number of depth or stencil bits, respectively, is zero."

Reading this paragraph literally, for the default framebuffer, NONE
should only be returned if attachment is DEPTH or STENCIL without
being allocated.

But it doesn't make much sense to return DEFAULT_FRAMEBUFFER if
the attachment type is NONE. For example, this can happen if the
attachment is FRONT_RIGHT in monoscopic mode, as that attachment
is only available in stereo mode.

With the current behaviour, defensive querying of the object type
would not work properly. So you could query the object type checking
for NONE, get DEFAULT_FRAMEBUFFER, and then get an INVALID_OPERATION
when requesting other pnames (like RED_SIZE), as the real attachment
type is NONE.
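
The intended query behaviour can be sketched as follows. This is a simplified model with hypothetical function names and string constants, not the code in mesa/main; DEFAULT_FRAMEBUFFER follows the wording of this message (the GL enum is GL_FRAMEBUFFER_DEFAULT).

```python
NONE = "NONE"
DEFAULT_FRAMEBUFFER = "DEFAULT_FRAMEBUFFER"


def attachment_object_type(is_default_framebuffer, att_type):
    """Model of the FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE query after the fix."""
    if att_type == NONE:
        # A missing attachment always reports NONE, even on the default
        # framebuffer (e.g. FRONT_RIGHT on a monoscopic config).
        return NONE
    if is_default_framebuffer:
        return DEFAULT_FRAMEBUFFER
    return att_type


def defensive_red_size(is_default_framebuffer, att_type, red_bits):
    """Only ask for RED_SIZE when the object type says something is there."""
    if attachment_object_type(is_default_framebuffer, att_type) == NONE:
        return None  # skip the query that would raise INVALID_OPERATION
    return red_bits


print(attachment_object_type(True, NONE))
print(attachment_object_type(True, "RENDERBUFFER"))
print(defensive_red_size(True, NONE, 8))
```

The second branch is left as before, matching the v2 note that behaviour for att->Type != GL_NONE must not change.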

This fixes:
GL45-CTS.direct_state_access.framebuffers_get_attachment_parameters

v2: don't change the behaviour for att->Type != GL_NONE, as it caused
    some ES CTS regressions
v3: simplify condition (Iago)

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-01-19 11:55:41 -02:00
Zachary Michaels
d7d32b3bfe radeonsi: Always leave poly_offset in a valid state
This commit makes si_update_poly_offset set poly_offset to NULL if
uses_poly_offset is false. This way poly_offset either points into the
currently queued rasterizer, or it is NULL.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99451
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-19 10:50:16 +01:00
Nicolai Hähnle
a7c635ec65 mesa/main: fix meta caller of _mesa_ClampColor
Since _mesa_ClampColor properly checks for support of the API function
now, its meta callers need to check support as well.

Fixes: 963311b71f ("mesa/main: fix version/extension checks in _mesa_ClampColor")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99401
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
2017-01-19 09:13:25 +01:00
Timothy Arceri
4d65f68a9b mesa/glsl: move TransformFeedbackBufferStride to gl_shader
Here we remove the single use of this field in gl_linked_shader
which allows us to move the field out of gl_shader_info

While we are at it we rewrite link_xfb_stride_layout_qualifiers()
to be more clear.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-19 17:05:26 +11:00
Timothy Arceri
e603cf1841 glsl: exit loop early if we find xfb layout qualifiers
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-19 17:05:26 +11:00
Timothy Arceri
7983ed5f65 glsl: set InnerCoverage directly in gl_program
Also move out of the shared gl_shader_info.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-19 17:05:26 +11:00
Timothy Arceri
1f141eaef6 glsl: tidy up PostDepthCoverage shader field
There is no reason for this to be in the shared gl_shader_info or
to copy it to gl_program at the end of linking (it's already there).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-19 17:05:26 +11:00
Timothy Arceri
3d41f4b990 mesa/glsl: move pixel_center_integer to gl_shader
This is only used by gl_linked_shader as a temp during linking
so use a temp there instead.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-19 17:05:26 +11:00
Timothy Arceri
0a9d102ddc mesa/glsl: move origin_upper_left to gl_shader
This is only used by gl_linked_shader as a temp during linking
so use a temp there instead.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-19 17:05:26 +11:00
Timothy Arceri
ceeedb9bb0 mesa/glsl: move uses_gl_fragcoord to gl_shader
This is only used by gl_linked_shader as a temp during linking
so use a temp there instead.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-19 17:05:26 +11:00
Timothy Arceri
66a6050ad8 mesa/glsl: move redeclares_gl_fragcoord to gl_shader
This is never used in gl_linked_shader other than as a temp
during linking so just use a temp instead.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-19 17:05:26 +11:00
Timothy Arceri
cc7ecce253 mesa/glsl: move ARB_fragment_coord_conventions_enable field
This is only used by gl_shader not gl_linked_shader so move it
there.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-19 17:05:26 +11:00
Timothy Arceri
ae28c5a60c st/mesa/glsl: set early_fragment_tests directly in shader_info
We also move EarlyFragmentTests out of the gl_shader_info struct
as it is now only used by gl_shader.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-19 17:05:26 +11:00
Timothy Arceri
5c93d27423 mesa/glsl/i965: set and use tcs vertices_out directly
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-19 17:05:26 +11:00
Timothy Arceri
4cd709e2bc i965: get outputs_written from gl_program
There is no need to go via the pointer in nir_shader. This change
is required for the shader cache as we don't create a nir_shader.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-19 17:05:26 +11:00
Dave Airlie
ef71b867ee gallivm: use #ifdef not #if for PIPE_ARCH_BIG_ENDIAN
This fixes the build on ppc/s390.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-19 16:00:53 +10:00
Timothy Arceri
3fe8d04a6d mesa: don't always set _NEW_PROGRAM when linking
We only need to set it when linking was successful and the program
being linked is currently active.

The programs_in_use mask is just used as a flag for now but in
a future change we will use it to update the CurrentProgram array.

V2: make sure to flush vertices before linking (suggested by Marek)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-01-19 15:55:02 +11:00
Timothy Arceri
aad93402c0 mesa: change init subroutine defaults helper to work per gl_program
A later patch will result in SSO programs calling this helper
per gl_program rather than per gl_shader_program.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-19 15:55:02 +11:00
Timothy Arceri
90d950038f mesa/glsl: move ProgramResourceList to gl_shader_program_data
We also move NumProgramResourceList at the same time.

GLES does interface validation on SSO at runtime so we need to move
this to be able to switch to storing gl_program pointers in
CurrentProgram.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-19 15:55:02 +11:00
Timothy Arceri
62f718bfcb glsl: store number of explicit uniform locations in gl_shader_program
This allows us to clean up the functions that pass this count around,
but more importantly we will be able to call the uniform linking
functions from the backend's linker without having to pass this
information to the backend directly via Driver.LinkShader().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-19 15:55:02 +11:00
Timothy Arceri
c054bbf0d4 glsl: create a new link_and_validate_uniforms() helper
Currently this just breaks up the linking code a bit but in the
future i965 will call this from the backend via Driver.LinkShader()
so that we can do NIR optimisations before assigning uniform
locations.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-19 15:55:02 +11:00
Timothy Arceri
ce4fb3c8a1 glsl: make a bunch of varying linking functions static
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-19 15:55:02 +11:00
Timothy Arceri
90fffd1770 glsl: move more varying linking code to link_varyings.cpp
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-01-19 15:55:02 +11:00
Topi Pohjolainen
180653c357 i965/blorp: Make post draw flush more explicit
Blits do not need any special treatment as the target buffer
object is added to render cache just as one does for normal draw.
Color clears and resolves in turn require explicit "end of pipe
synchronization". It is not clear what this means exactly but the
assumption is that render cache flush with command stream stall
should be sufficient.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-18 22:42:47 +02:00
Topi Pohjolainen
46b346899d i965/gen6: Issue direct depth stall and flush after depth clear
instead of unconditionally calling brw_emit_mi_flush(), which
does:

   brw_emit_pipe_control_flush(brw,
                                PIPE_CONTROL_DEPTH_CACHE_FLUSH |
                                PIPE_CONTROL_RENDER_TARGET_FLUSH |
                                PIPE_CONTROL_CS_STALL);

   brw_emit_pipe_control_flush(brw,
                                PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
                                PIPE_CONTROL_CONST_CACHE_INVALIDATE);

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-18 22:42:47 +02:00
Topi Pohjolainen
e6da6943fe i965: Make depth clear flushing more explicit
Current blorp logic issues unconditional "flush everything"
(see brw_emit_mi_flush()) after each render. For example, all
blits issue this unconditionally which shouldn't be needed if
they set render cache properly so that subsequent renders do
necessary flushing before drawing.

In case of piglit:

ext_framebuffer_multisample-accuracy all_samples depth_draw small

intel_hiz_exec() is always preceded by a blorp blit, and the
unconditional flush appears to hide the lack of stalls and flushes
in depth clears. By removing the brw_emit_mi_flush() call I get GPU
hangs.

This patch adds the stalls and flushes mandated by the spec
and gets rid of those hangs.

v2 (Jason, Ken): Document the rationale for separating
                 depth cache flush and stall on Gen7.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-18 22:42:47 +02:00
Topi Pohjolainen
4840a53e90 i965/blorp: Use the render cache mechanism instead of explicit flushing
by replacing brw_emit_mi_flush() with brw_render_cache_set_check_flush().
The latter splits the flush in two:

   brw_emit_pipe_control_flush(brw,
                               PIPE_CONTROL_DEPTH_CACHE_FLUSH |
                               PIPE_CONTROL_RENDER_TARGET_FLUSH |
                               PIPE_CONTROL_CS_STALL);

   brw_emit_pipe_control_flush(brw,
                               PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
                               PIPE_CONTROL_CONST_CACHE_INVALIDATE);

instead of

   int flags = PIPE_CONTROL_NO_WRITE | PIPE_CONTROL_RENDER_TARGET_FLUSH;
   if (brw->gen >= 6) {
      flags |= PIPE_CONTROL_INSTRUCTION_INVALIDATE |
               PIPE_CONTROL_CONST_CACHE_INVALIDATE |
               PIPE_CONTROL_DEPTH_CACHE_FLUSH |
               PIPE_CONTROL_VF_CACHE_INVALIDATE |
               PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
               PIPE_CONTROL_CS_STALL;
   }
   brw_emit_pipe_control_flush(brw, flags);

v2 (Jason): Check that destination exists before trying to add to
            render cache. Depth clears and resolves don't have it.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-18 22:42:47 +02:00
Emil Velikov
ea8b2624c8 utils: really remove the __END_DECLS macro
Fixes: d1efa09d34 "util: import sha1 implementation from OpenBSD"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-18 20:09:57 +00:00
Emil Velikov
9f8dc3bf03 utils: build sha1/disk cache only with Android/Autoconf
An earlier commit imported a SHA1 implementation and relaxed the SHA1 and
disk cache handling, breaking the Windows builds.

Restrict things for now until we get to a proper fix.

Fixes: d1efa09d34 "util: import sha1 implementation from OpenBSD"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-01-18 20:09:01 +00:00
3040 changed files with 448047 additions and 190354 deletions


@@ -33,3 +33,7 @@ indent_size = 2
[*.patch]
trim_trailing_whitespace = false
[meson.build,meson_options.txt]
indent_style = space
indent_size = 2

.gitignore

@@ -7,11 +7,13 @@
*.log
*.o
*.obj
*.orig
*.os
*.pc
*.pdb
*.pyc
*.pyo
*.rej
*.so
*.so.*
*.sw[a-z]


@@ -1,53 +1,414 @@
language: c
sudo: true
sudo: false
dist: trusty
cache:
directories:
- $HOME/.ccache
addons:
apt:
packages:
- libdrm-dev
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libxcb-dri2-0-dev
- libx11-xcb-dev
- llvm-3.5-dev
# llvm-config is not in the dev package?
- llvm-3.5
# LLVM packaging is broken and misses this dep.
- libedit-dev
- scons
apt: true
ccache: true
env:
global:
- XORG_RELEASES=http://xorg.freedesktop.org/releases/individual
- XCB_RELEASES=http://xcb.freedesktop.org/dist
- WAYLAND_RELEASES=http://wayland.freedesktop.org/releases
- XORGMACROS_VERSION=util-macros-1.19.0
- GLPROTO_VERSION=glproto-1.4.17
- DRI2PROTO_VERSION=dri2proto-2.8
- DRI3PROTO_VERSION=dri3proto-1.0
- PRESENTPROTO_VERSION=presentproto-1.0
- LIBPCIACCESS_VERSION=libpciaccess-0.13.4
- LIBDRM_VERSION=libdrm-2.4.74
- XCBPROTO_VERSION=xcb-proto-1.11
- LIBXCB_VERSION=libxcb-1.11
- LIBXSHMFENCE_VERSION=libxshmfence-1.2
- PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig
matrix:
- BUILD=make
- BUILD=scons
- LIBVDPAU_VERSION=libvdpau-1.1
- LIBVA_VERSION=libva-1.6.2
- LIBWAYLAND_VERSION=wayland-1.11.1
- WAYLAND_PROTOCOLS_VERSION=wayland-protocols-1.8
- PKG_CONFIG_PATH=$HOME/prefix/lib/pkgconfig:$HOME/prefix/share/pkgconfig
- LD_LIBRARY_PATH="$HOME/prefix/lib:$LD_LIBRARY_PATH"
- PATH="$HOME/prefix/bin:$PATH"
matrix:
include:
- env:
- LABEL="make loaders/classic DRI"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="make check"
- DRI_LOADERS="--enable-glx --enable-gbm --enable-egl --with-platforms=x11,drm,surfaceless,wayland --enable-osmesa"
- DRI_DRIVERS="i915,i965,radeon,r200,swrast,nouveau"
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS=""
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--disable-libunwind"
addons:
apt:
packages:
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libxdamage-dev
- libxfixes-dev
- env:
# NOTE: Building SWR is 2x (yes two) times slower than all the other
# gallium drivers combined.
# Start this early so that it doesn't hinder the run time.
- LABEL="make Gallium Drivers SWR"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=3.9
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- OVERRIDE_CC="gcc-4.8"
- OVERRIDE_CXX="g++-4.8"
# New binutils linker is required for llvm-3.9
- OVERRIDE_PATH=/usr/lib/binutils-2.26/bin
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="swr"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
packages:
- binutils-2.26
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
- LABEL="make Gallium Drivers Other"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=3.9
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
# New binutils linker is required for llvm-3.9
- OVERRIDE_PATH=/usr/lib/binutils-2.26/bin
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="i915,nouveau,pl111,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,etnaviv,imx"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
packages:
- binutils-2.26
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
# NOTE: Analogous to SWR above, building Clover is quite slow.
- LABEL="make Gallium ST Clover LLVM-3.9"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=3.9
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- OVERRIDE_CC=gcc-4.7
- OVERRIDE_CXX=g++-4.7
# New binutils linker is required for llvm-3.9
- OVERRIDE_PATH=/usr/lib/binutils-2.26/bin
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="r600,radeonsi"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
packages:
- binutils-2.26
- libclc-dev
# LLVM packaging is broken and misses these dependencies
- libedit-dev
- g++-4.7
# From sources above
- llvm-3.9-dev
- clang-3.9
- libclang-3.9-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
# NOTE: Analogous to SWR above, building Clover is quite slow.
- LABEL="make Gallium ST Clover LLVM-4.0"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=4.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- OVERRIDE_CC=gcc-4.8
- OVERRIDE_CXX=g++-4.8
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="r600,radeonsi"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-4.0
packages:
- libclc-dev
# LLVM packaging is broken and misses these dependencies
- libedit-dev
- g++-4.8
# From sources above
- llvm-4.0-dev
- clang-4.0
- libclang-4.0-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
# NOTE: Analogous to SWR above, building Clover is quite slow.
- LABEL="make Gallium ST Clover LLVM-5.0"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=5.0
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- OVERRIDE_CC=gcc-4.8
- OVERRIDE_CXX=g++-4.8
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--disable-dri --enable-opencl --enable-opencl-icd --enable-llvm --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS="r600,radeonsi"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-5.0
packages:
- libclc-dev
# LLVM packaging is broken and misses these dependencies
- libedit-dev
- g++-4.8
# From sources above
- llvm-5.0-dev
- clang-5.0
- libclang-5.0-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
- LABEL="make Gallium ST Other"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="true"
- LLVM_VERSION=3.3
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --disable-opencl --enable-xa --enable-nine --enable-xvmc --enable-vdpau --enable-va --enable-omx-bellagio --enable-gallium-osmesa"
# We need swrast for osmesa and nine.
# i915 most likely doesn't work with most ST.
# Regardless - we're doing a quick build test here.
- GALLIUM_DRIVERS="i915,swrast"
- VULKAN_DRIVERS=""
- LIBUNWIND_FLAGS="--enable-libunwind"
addons:
apt:
packages:
# We actually want to test against llvm-3.3
- llvm-3.3-dev
# Nine requires gcc 4.6... which is the one we have right ?
- libxvmc-dev
# Build locally, for now.
#- libvdpau-dev
#- libva-dev
- libomxil-bellagio-dev
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- libunwind8-dev
- env:
- LABEL="make Vulkan"
- BUILD=make
- MAKEFLAGS="-j4"
- MAKE_CHECK_COMMAND="make -C src/gtest check && make -C src/intel check"
- LLVM_VERSION=3.9
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
# New binutils linker is required for llvm-3.9
- OVERRIDE_PATH=/usr/lib/binutils-2.26/bin
- DRI_LOADERS="--disable-glx --disable-gbm --disable-egl --with-platforms=x11,wayland"
- DRI_DRIVERS=""
- GALLIUM_ST="--enable-dri --enable-dri3 --disable-opencl --disable-xa --disable-nine --disable-xvmc --disable-vdpau --disable-va --disable-omx-bellagio --disable-gallium-osmesa"
- GALLIUM_DRIVERS=""
- VULKAN_DRIVERS="intel,radeon"
- LIBUNWIND_FLAGS="--disable-libunwind"
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
packages:
- binutils-2.26
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- env:
- LABEL="scons"
- BUILD=scons
- SCONSFLAGS="-j4"
# Explicitly disable.
- SCONS_TARGET="llvm=0"
# Keep it symmetrical to the make build.
- SCONS_CHECK_COMMAND="scons llvm=0 check"
addons:
apt:
packages:
- scons
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- env:
- LABEL="scons LLVM"
- BUILD=scons
- SCONSFLAGS="-j4"
- SCONS_TARGET="llvm=1"
# Keep it symmetrical to the make build.
- SCONS_CHECK_COMMAND="scons llvm=1 check"
- LLVM_VERSION=3.3
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
addons:
apt:
packages:
- scons
# LLVM packaging is broken and misses these dependencies
- libedit-dev
- llvm-3.3-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- env:
- LABEL="scons SWR"
- BUILD=scons
- SCONSFLAGS="-j4"
- SCONS_TARGET="swr=1"
- LLVM_VERSION=3.9
- LLVM_CONFIG="llvm-config-${LLVM_VERSION}"
# Keep it symmetrical to the make build. There's no actual SWR, yet.
- SCONS_CHECK_COMMAND="true"
- OVERRIDE_CC="gcc-4.8"
- OVERRIDE_CXX="g++-4.8"
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
packages:
- scons
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
# Common
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libelf-dev
- env:
- LABEL="meson Vulkan"
- BUILD=meson
- MESON_OPTIONS="-Ddri-drivers= -Dgallium-drivers="
addons:
apt:
sources:
- llvm-toolchain-trusty-3.9
packages:
# LLVM packaging is broken and misses these dependencies
- libedit-dev
# From sources above
- llvm-3.9-dev
# Common
- xz-utils
- libexpat1-dev
- libelf-dev
- python3-pip
- env:
- LABEL="meson loaders/classic DRI"
- BUILD=meson
- MESON_OPTIONS="-Dvulkan-drivers= -Dgallium-drivers="
addons:
apt:
packages:
- xz-utils
- x11proto-xf86vidmode-dev
- libexpat1-dev
- libx11-xcb-dev
- libxdamage-dev
- libxfixes-dev
- python3-pip
install:
- export PATH="/usr/lib/ccache:$PATH"
- pip install --user mako
# Install the latest meson from pip, since the version in the ubuntu repos is
# often quite old.
- if test "x$BUILD" = xmeson; then
pip3 install --user meson;
fi
# Since libdrm gets updated in configure.ac regularly, try to pick up the
# latest version from there.
- for line in `grep "^LIBDRM_.*_REQUIRED=" configure.ac`; do
- for line in `grep "^LIBDRM.*_REQUIRED=" configure.ac`; do
old_ver=`echo $LIBDRM_VERSION | sed 's/libdrm-//'`;
new_ver=`echo $line | sed 's/.*REQUIRED=//'`;
if `echo "$old_ver,$new_ver" | tr ',' '\n' | sort -Vc 2> /dev/null`; then
@@ -70,14 +431,6 @@ install:
- tar -jxvf $DRI2PROTO_VERSION.tar.bz2
- (cd $DRI2PROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XORG_RELEASES/proto/$DRI3PROTO_VERSION.tar.bz2
- tar -jxvf $DRI3PROTO_VERSION.tar.bz2
- (cd $DRI3PROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XORG_RELEASES/proto/$PRESENTPROTO_VERSION.tar.bz2
- tar -jxvf $PRESENTPROTO_VERSION.tar.bz2
- (cd $PRESENTPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget $XCB_RELEASES/$XCBPROTO_VERSION.tar.bz2
- tar -jxvf $XCBPROTO_VERSION.tar.bz2
- (cd $XCBPROTO_VERSION && ./configure --prefix=$HOME/prefix && make install)
@@ -92,21 +445,75 @@ install:
- wget http://dri.freedesktop.org/libdrm/$LIBDRM_VERSION.tar.bz2
- tar -jxvf $LIBDRM_VERSION.tar.bz2
- (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix --enable-vc4 --enable-etnaviv-experimental-api && make install)
- (cd $LIBDRM_VERSION && ./configure --prefix=$HOME/prefix --enable-vc4 --enable-freedreno --enable-etnaviv-experimental-api && make install)
- wget $XORG_RELEASES/lib/$LIBXSHMFENCE_VERSION.tar.bz2
- tar -jxvf $LIBXSHMFENCE_VERSION.tar.bz2
- (cd $LIBXSHMFENCE_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget http://people.freedesktop.org/~aplattner/vdpau/$LIBVDPAU_VERSION.tar.bz2
- tar -jxvf $LIBVDPAU_VERSION.tar.bz2
- (cd $LIBVDPAU_VERSION && ./configure --prefix=$HOME/prefix && make install)
- wget http://www.freedesktop.org/software/vaapi/releases/libva/$LIBVA_VERSION.tar.bz2
- tar -jxvf $LIBVA_VERSION.tar.bz2
- (cd $LIBVA_VERSION && ./configure --prefix=$HOME/prefix --disable-wayland --disable-dummy-driver && make install)
- wget $WAYLAND_RELEASES/$LIBWAYLAND_VERSION.tar.xz
- tar -axvf $LIBWAYLAND_VERSION.tar.xz
- (cd $LIBWAYLAND_VERSION && ./configure --prefix=$HOME/prefix --enable-libraries --without-host-scanner --disable-documentation --disable-dtd-validation && make install)
- wget $WAYLAND_RELEASES/$WAYLAND_PROTOCOLS_VERSION.tar.xz
- tar -axvf $WAYLAND_PROTOCOLS_VERSION.tar.xz
- (cd $WAYLAND_PROTOCOLS_VERSION && ./configure --prefix=$HOME/prefix && make install)
# Meson requires ninja >= 1.6, but trusty has 1.3.x
- wget https://github.com/ninja-build/ninja/releases/download/v1.6.0/ninja-linux.zip;
- unzip ninja-linux.zip
- mv ninja $HOME/prefix/bin/
# Generate the header since one is missing on the Travis instance
- mkdir -p linux
- printf "%s\n" \
"#ifndef _LINUX_MEMFD_H" \
"#define _LINUX_MEMFD_H" \
"" \
"#define __NR_memfd_create 319" \
"#define SYS_memfd_create __NR_memfd_create" \
"" \
"#define MFD_CLOEXEC 0x0001U" \
"#define MFD_ALLOW_SEALING 0x0002U" \
"" \
"#endif /* _LINUX_MEMFD_H */" > linux/memfd.h
script:
- if test "x$BUILD" = xmake; then
test -n "$OVERRIDE_CC" && export CC="$OVERRIDE_CC";
test -n "$OVERRIDE_CXX" && export CXX="$OVERRIDE_CXX";
test -n "$OVERRIDE_PATH" && export PATH="$OVERRIDE_PATH:$PATH";
export CFLAGS="$CFLAGS -isystem`pwd`";
./autogen.sh --enable-debug
--with-egl-platforms=x11,drm
--with-dri-drivers=i915,i965,radeon,r200,swrast,nouveau
--with-gallium-drivers=svga,swrast,vc4,virgl,r300,r600,etnaviv,imx
$LIBUNWIND_FLAGS
$DRI_LOADERS
--with-dri-drivers=$DRI_DRIVERS
$GALLIUM_ST
--with-gallium-drivers=$GALLIUM_DRIVERS
--with-vulkan-drivers=$VULKAN_DRIVERS
--disable-llvm-shared-libs
;
make && make check;
elif test x$BUILD = xscons; then
scons;
&&
make && eval $MAKE_CHECK_COMMAND;
fi
- if test "x$BUILD" = xscons; then
test -n "$OVERRIDE_CC" && export CC="$OVERRIDE_CC";
test -n "$OVERRIDE_CXX" && export CXX="$OVERRIDE_CXX";
scons $SCONS_TARGET && eval $SCONS_CHECK_COMMAND;
fi
- if test "x$BUILD" = xmeson; then
export CFLAGS="$CFLAGS -isystem`pwd`";
meson _build $MESON_OPTIONS;
ninja -C _build;
fi
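
The install step above scans configure.ac for `LIBDRM*_REQUIRED=` lines and compares each against `$LIBDRM_VERSION` with `sort -V`; the rest of that loop is cut off by the hunk. A minimal Python sketch of the same idea, assuming the intent is to keep the highest required version, could look like this. The function names and the version numbers in the usage note are illustrative, not taken from the repository.

```python
import re


def parse_version(text):
    """Split a dotted version such as '2.4.85' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def newest_required_libdrm(configure_ac_text, current):
    """Return 'libdrm-<version>' for the highest version seen.

    `current` plays the role of $LIBDRM_VERSION, e.g. 'libdrm-2.4.74'.
    Assumes purely numeric dotted versions.
    """
    best = parse_version(current[len("libdrm-"):])
    pattern = re.compile(r"^LIBDRM.*_REQUIRED=(\S+)$", re.MULTILINE)
    for match in pattern.finditer(configure_ac_text):
        candidate = parse_version(match.group(1))
        if candidate > best:
            best = candidate
    return "libdrm-" + ".".join(str(n) for n in best)
```

Comparing integer tuples gives the same ordering as `sort -V` for plain dotted versions, so `2.4.100` correctly ranks above `2.4.85`.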


@@ -30,19 +30,23 @@ LOCAL_C_INCLUDES += \
$(MESA_TOP)/include
MESA_VERSION := $(shell cat $(MESA_TOP)/VERSION)
# define ANDROID_VERSION (e.g., 4.0.x => 0x0400)
LOCAL_CFLAGS += \
-Wno-unused-parameter \
-Wno-date-time \
-Wno-pointer-arith \
-Wno-missing-field-initializers \
-Wno-initializer-overrides \
-Wno-mismatched-tags \
-DVERSION=\"$(MESA_VERSION)\" \
-DPACKAGE_VERSION=\"$(MESA_VERSION)\" \
-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\" \
-DANDROID_VERSION=0x0$(MESA_ANDROID_MAJOR_VERSION)0$(MESA_ANDROID_MINOR_VERSION)
-DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\"
# XXX: The following __STDC_*_MACROS defines should not be needed.
# It's likely due to a bug elsewhere, but let's temporarily add them
# here to fix the radeonsi build.
LOCAL_CFLAGS += \
-DANDROID_API_LEVEL=$(PLATFORM_SDK_VERSION) \
-DENABLE_SHADER_CACHE \
-D__STDC_CONSTANT_MACROS \
-D__STDC_LIMIT_MACROS \
-DHAVE___BUILTIN_EXPECT \
-DHAVE___BUILTIN_FFS \
@@ -51,7 +55,7 @@ LOCAL_CFLAGS += \
-DHAVE_FUNC_ATTRIBUTE_UNUSED \
-DHAVE_FUNC_ATTRIBUTE_FORMAT \
-DHAVE_FUNC_ATTRIBUTE_PACKED \
_DHAVE_FUNC_ATTRIBUTE_ALIAS \
-DHAVE_FUNC_ATTRIBUTE_ALIAS \
-DHAVE___BUILTIN_CTZ \
-DHAVE___BUILTIN_POPCOUNT \
-DHAVE___BUILTIN_POPCOUNTLL \
@@ -59,10 +63,19 @@ LOCAL_CFLAGS += \
-DHAVE___BUILTIN_CLZLL \
-DHAVE___BUILTIN_UNREACHABLE \
-DHAVE_PTHREAD=1 \
-DHAVE_DLOPEN \
-DHAVE_DLADDR \
-DHAVE_DL_ITERATE_PHDR \
-DMAJOR_IN_SYSMACROS \
-fvisibility=hidden \
-Wno-sign-compare
LOCAL_CPPFLAGS += \
-D__STDC_CONSTANT_MACROS \
-D__STDC_FORMAT_MACROS \
-D__STDC_LIMIT_MACROS \
-Wno-error=non-virtual-dtor \
-Wno-non-virtual-dtor
# mesa requires at least c99 compiler
LOCAL_CONLYFLAGS += \
-std=c99
@@ -70,38 +83,23 @@ LOCAL_CONLYFLAGS += \
ifeq ($(strip $(MESA_ENABLE_ASM)),true)
ifeq ($(TARGET_ARCH),x86)
LOCAL_CFLAGS += \
-DUSE_X86_ASM \
-DUSE_X86_ASM
endif
endif
ifeq ($(MESA_ENABLE_LLVM),true)
LOCAL_CFLAGS += \
-DHAVE_LLVM=0x0305 -DMESA_LLVM_VERSION_PATCH=2 \
-D__STDC_CONSTANT_MACROS \
-D__STDC_FORMAT_MACROS \
-D__STDC_LIMIT_MACROS
ifeq ($(ARCH_ARM_HAVE_NEON),true)
LOCAL_CFLAGS_arm += -DUSE_ARM_ASM
endif
LOCAL_CFLAGS_arm64 += -DUSE_AARCH64_ASM
ifneq ($(LOCAL_IS_HOST_MODULE),true)
# add libdrm if there are hardware drivers
ifneq ($(filter-out swrast,$(MESA_GPU_DRIVERS)),)
LOCAL_CFLAGS += -DHAVE_LIBDRM
LOCAL_SHARED_LIBRARIES += libdrm
endif
endif
LOCAL_CPPFLAGS += \
$(if $(filter true,$(MESA_LOLLIPOP_BUILD)),-D_USING_LIBCXX) \
-Wno-error=non-virtual-dtor \
-Wno-non-virtual-dtor
ifeq ($(MESA_LOLLIPOP_BUILD),true)
LOCAL_CFLAGS_32 += -DDEFAULT_DRIVER_DIR=\"/system/lib/$(MESA_DRI_MODULE_REL_PATH)\"
LOCAL_CFLAGS_64 += -DDEFAULT_DRIVER_DIR=\"/system/lib64/$(MESA_DRI_MODULE_REL_PATH)\"
else
LOCAL_CFLAGS += -DDEFAULT_DRIVER_DIR=\"/system/lib/$(MESA_DRI_MODULE_REL_PATH)\"
endif
LOCAL_CFLAGS_32 += -DDEFAULT_DRIVER_DIR=\"/vendor/lib/$(MESA_DRI_MODULE_REL_PATH)\"
LOCAL_CFLAGS_64 += -DDEFAULT_DRIVER_DIR=\"/vendor/lib64/$(MESA_DRI_MODULE_REL_PATH)\"
LOCAL_PROPRIETARY_MODULE := true
# uncomment to keep the debug symbols
#LOCAL_STRIP_MODULE := false
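
The `-DHAVE_LLVM=0x0305` flag in the hunk above packs an LLVM version into one integer. Assuming the hex digits read as major version in the high byte and minor version in the low byte (so 0x0305 would mean 3.5), the encoding can be sketched as follows. The helper names are hypothetical.

```python
def encode_llvm_version(major, minor):
    """Pack major/minor into the assumed 0xMMmm layout."""
    return (major << 8) | minor


def decode_llvm_version(value):
    """Unpack a value such as 0x0305 into (major, minor)."""
    return value >> 8, value & 0xFF
```

With this layout, version checks reduce to integer comparisons, which is what makes a single define convenient for preprocessor conditionals.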


@@ -24,7 +24,7 @@
# BOARD_GPU_DRIVERS should be defined. The valid values are
#
# classic drivers: i915 i965
# gallium drivers: swrast freedreno i915g ilo nouveau r300g r600g radeonsi vc4 virgl vmwgfx
# gallium drivers: swrast freedreno i915g nouveau pl111 r300g r600g radeonsi vc4 virgl vmwgfx etnaviv imx
#
# The main target is libGLES_mesa. For each classic driver enabled, a DRI
# module will also be built. DRI modules will be loaded by libGLES_mesa.
@@ -32,14 +32,8 @@
MESA_TOP := $(call my-dir)
MESA_ANDROID_MAJOR_VERSION := $(word 1, $(subst ., , $(PLATFORM_VERSION)))
MESA_ANDROID_MINOR_VERSION := $(word 2, $(subst ., , $(PLATFORM_VERSION)))
MESA_ANDROID_VERSION := $(MESA_ANDROID_MAJOR_VERSION).$(MESA_ANDROID_MINOR_VERSION)
ifeq ($(filter 1 2 3 4,$(MESA_ANDROID_MAJOR_VERSION)),)
MESA_LOLLIPOP_BUILD := true
else
define local-generated-sources-dir
$(call local-intermediates-dir)
endef
ifneq ($(filter 2 4, $(MESA_ANDROID_MAJOR_VERSION)),)
$(error "Android 4.4 and earlier not supported")
endif
MESA_DRI_MODULE_REL_PATH := dri
@@ -49,19 +43,43 @@ MESA_DRI_MODULE_UNSTRIPPED_PATH := $(TARGET_OUT_SHARED_LIBRARIES_UNSTRIPPED)/$(M
MESA_COMMON_MK := $(MESA_TOP)/Android.common.mk
MESA_PYTHON2 := python
classic_drivers := i915 i965
gallium_drivers := swrast freedreno i915g ilo nouveau r300g r600g radeonsi vmwgfx vc4 virgl
# Lists to convert driver names to boolean variables
# in form of <driver name>.<boolean make variable>
classic_drivers := i915.HAVE_I915_DRI i965.HAVE_I965_DRI
gallium_drivers := \
swrast.HAVE_GALLIUM_SOFTPIPE \
freedreno.HAVE_GALLIUM_FREEDRENO \
i915g.HAVE_GALLIUM_I915 \
nouveau.HAVE_GALLIUM_NOUVEAU \
pl111.HAVE_GALLIUM_PL111 \
r300g.HAVE_GALLIUM_R300 \
r600g.HAVE_GALLIUM_R600 \
radeonsi.HAVE_GALLIUM_RADEONSI \
vmwgfx.HAVE_GALLIUM_VMWGFX \
vc4.HAVE_GALLIUM_VC4 \
virgl.HAVE_GALLIUM_VIRGL \
etnaviv.HAVE_GALLIUM_ETNAVIV \
imx.HAVE_GALLIUM_IMX
MESA_GPU_DRIVERS := $(strip $(BOARD_GPU_DRIVERS))
# warn about invalid drivers
invalid_drivers := $(filter-out \
$(classic_drivers) $(gallium_drivers), $(MESA_GPU_DRIVERS))
ifneq ($(invalid_drivers),)
$(warning invalid GPU drivers: $(invalid_drivers))
# tidy up
MESA_GPU_DRIVERS := $(filter-out $(invalid_drivers), $(MESA_GPU_DRIVERS))
ifeq ($(BOARD_GPU_DRIVERS),all)
MESA_BUILD_CLASSIC := $(filter HAVE_%, $(subst ., , $(classic_drivers)))
MESA_BUILD_GALLIUM := $(filter HAVE_%, $(subst ., , $(gallium_drivers)))
else
# Warn if we have any invalid driver names
$(foreach d, $(BOARD_GPU_DRIVERS), \
$(if $(findstring $(d).,$(classic_drivers) $(gallium_drivers)), \
, \
$(warning invalid GPU driver: $(d)) \
) \
)
MESA_BUILD_CLASSIC := $(strip $(foreach d, $(BOARD_GPU_DRIVERS), $(patsubst $(d).%,%, $(filter $(d).%, $(classic_drivers)))))
MESA_BUILD_GALLIUM := $(strip $(foreach d, $(BOARD_GPU_DRIVERS), $(patsubst $(d).%,%, $(filter $(d).%, $(gallium_drivers)))))
endif
ifeq ($(filter x86%,$(TARGET_ARCH)),)
MESA_BUILD_CLASSIC :=
endif
$(foreach d, $(MESA_BUILD_CLASSIC) $(MESA_BUILD_GALLIUM), $(eval $(d) := true))
# host and target must be the same arch to generate matypes.h
ifeq ($(TARGET_ARCH),$(HOST_ARCH))
@@ -70,23 +88,25 @@ else
MESA_ENABLE_ASM := false
endif
ifneq ($(filter $(classic_drivers), $(MESA_GPU_DRIVERS)),)
MESA_BUILD_CLASSIC := true
else
MESA_BUILD_CLASSIC := false
ifneq ($(filter true, $(HAVE_GALLIUM_RADEONSI)),)
MESA_ENABLE_LLVM := true
endif
ifneq ($(filter $(gallium_drivers), $(MESA_GPU_DRIVERS)),)
MESA_BUILD_GALLIUM := true
else
MESA_BUILD_GALLIUM := false
endif
MESA_ENABLE_LLVM := $(if $(filter radeonsi,$(MESA_GPU_DRIVERS)),true,false)
define mesa-build-with-llvm
$(if $(filter $(MESA_ANDROID_MAJOR_VERSION), 4 5), \
$(warning Unsupported LLVM version in Android $(MESA_ANDROID_MAJOR_VERSION)),) \
$(if $(filter 6,$(MESA_ANDROID_MAJOR_VERSION)), \
$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0307 -DMESA_LLVM_VERSION_PATCH=0)) \
$(if $(filter 7,$(MESA_ANDROID_MAJOR_VERSION)), \
$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0308 -DMESA_LLVM_VERSION_PATCH=0)) \
$(if $(filter 8,$(MESA_ANDROID_MAJOR_VERSION)), \
$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309 -DMESA_LLVM_VERSION_PATCH=0)) \
$(if $(filter P,$(MESA_ANDROID_MAJOR_VERSION)), \
$(eval LOCAL_CFLAGS += -DHAVE_LLVM=0x0309 -DMESA_LLVM_VERSION_PATCH=0)) \
$(eval LOCAL_SHARED_LIBRARIES += libLLVM)
endef
# add subdirectories
ifneq ($(strip $(MESA_GPU_DRIVERS)),)
SUBDIRS := \
src/gbm \
src/loader \
@@ -96,15 +116,11 @@ SUBDIRS := \
src/util \
src/egl \
src/amd \
src/broadcom \
src/intel \
src/mesa/drivers/dri
src/mesa/drivers/dri \
src/vulkan
INC_DIRS := $(call all-named-subdir-makefiles,$(SUBDIRS))
ifeq ($(strip $(MESA_BUILD_GALLIUM)),true)
INC_DIRS += $(call all-named-subdir-makefiles,src/gallium)
endif
include $(INC_DIRS)
endif
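
The new driver lists above pair each driver name with a boolean make variable in the form `<driver name>.<boolean make variable>`, and `BOARD_GPU_DRIVERS` is translated into the set of `HAVE_*` variables to enable. A rough Python sketch of that lookup, under the assumption that names are matched exactly (the makefile uses `findstring` for its warning, which is a substring match), might be:

```python
def select_drivers(requested, table):
    """Map requested driver names to their HAVE_* variables.

    `table` holds entries such as 'vc4.HAVE_GALLIUM_VC4'.
    Returns (enabled_variables, invalid_names).
    """
    names = [entry.split(".", 1)[0] for entry in table]
    variables = [entry.split(".", 1)[1] for entry in table]
    if requested == ["all"]:
        # 'all' enables every variable in the table, in table order.
        return variables, []
    mapping = dict(zip(names, variables))
    enabled = [mapping[name] for name in requested if name in mapping]
    invalid = [name for name in requested if name not in mapping]
    return enabled, invalid
```

The `invalid` list corresponds to the names the makefile would report through `$(warning invalid GPU driver: ...)`.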


@@ -27,7 +27,7 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \
--enable-egl \
--enable-gallium-tests \
--enable-gallium-osmesa \
--enable-gallium-llvm \
--enable-llvm \
--enable-gbm \
--enable-gles1 \
--enable-gles2 \
@@ -41,9 +41,10 @@ AM_DISTCHECK_CONFIGURE_FLAGS = \
--enable-xa \
--enable-xvmc \
--enable-llvm-shared-libs \
--with-egl-platforms=x11,wayland,drm,surfaceless \
--enable-libunwind \
--with-platforms=x11,wayland,drm,surfaceless \
--with-dri-drivers=i915,i965,nouveau,radeon,r200,swrast \
--with-gallium-drivers=i915,ilo,nouveau,r300,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,swr,etnaviv,imx \
--with-gallium-drivers=i915,nouveau,r300,pl111,r600,radeonsi,freedreno,svga,swrast,vc4,virgl,swr,etnaviv,imx \
--with-vulkan-drivers=intel,radeon
ACLOCAL_AMFLAGS = -I m4
@@ -53,14 +54,22 @@ EXTRA_DIST = \
common.py \
docs \
doxygen \
bin/git_sha1_gen.py \
scons \
SConstruct
SConstruct \
build-support/conftest.dyn \
build-support/conftest.map
noinst_HEADERS = \
include/c99_alloca.h \
include/c99_compat.h \
include/c99_math.h \
include/c11 \
include/drm-uapi/drm.h \
include/drm-uapi/drm_fourcc.h \
include/drm-uapi/drm_mode.h \
include/drm-uapi/i915_drm.h \
include/drm-uapi/vc4_drm.h \
include/D3D9 \
include/GL/wglext.h \
include/HaikuGL \


@@ -58,6 +58,7 @@ F: src/compiler/nir/
DOCUMENTATION
R: Emil Velikov <emil.l.velikov@gmail.com>
R: Eric Engestrom <eric@engestrom.ch>
F: docs/
F: doxygen/
@@ -69,6 +70,10 @@ DRI LOADER
R: Emil Velikov <emil.l.velikov@gmail.com>
F: src/loader/
EGL
R: Eric Engestrom <eric@engestrom.ch>
F: src/egl/
GALLIUM LOADER
R: Emil Velikov <emil.l.velikov@gmail.com>
F: src/gallium/auxiliary/pipe-loader/
@@ -80,6 +85,7 @@ F: src/gallium/targets/
AUTOCONF BUILD
R: Emil Velikov <emil.l.velikov@gmail.com>
F: autogen.sh
F: configure.ac
F: */Automake.inc
F: */Makefile.*am
@@ -97,6 +103,12 @@ F: CleanSpec.mk
F: */Android.*mk
F: */Makefile.sources
MESON BUILD
R: Dylan Baker <dylan@pnwbakers.com>
R: Eric Engestrom <eric@engestrom.ch>
F: */meson.build
F: meson_options.txt
ANDROID EGL SUPPORT
R: Rob Herring <robh@kernel.org>
R: Tomasz Figa <tfiga@chromium.org>


@@ -50,10 +50,10 @@ except KeyError:
pass
else:
targets = targets.split(',')
print 'scons: warning: targets option is deprecated; pass the targets on their own such as'
print
print ' scons %s' % ' '.join(targets)
print
print('scons: warning: targets option is deprecated; pass the targets on their own such as')
print()
print(' scons %s' % ' '.join(targets))
print()
COMMAND_LINE_TARGETS.append(targets)
@@ -152,8 +152,7 @@ try:
except ImportError:
pass
else:
aliases = default_ans.keys()
aliases.sort()
aliases = sorted(default_ans.keys())
env.Help('\n')
env.Help('Recognized targets:\n')
for alias in aliases:


@@ -1 +1 @@
17.0.0-devel
17.3.0-devel


@@ -34,13 +34,13 @@ branches:
clone_depth: 100
cache:
- win_flex_bison-2.4.5.zip
- win_flex_bison-2.5.9.zip
- llvm-3.3.1-msvc2013-mtd.7z
os: Visual Studio 2013
environment:
WINFLEXBISON_ARCHIVE: win_flex_bison-2.4.5.zip
WINFLEXBISON_ARCHIVE: win_flex_bison-2.5.9.zip
LLVM_ARCHIVE: llvm-3.3.1-msvc2013-mtd.7z
install:
@@ -48,11 +48,13 @@ install:
- python --version
- python -m pip --version
# Install Mako
- python -m pip install --egg Mako
- python -m pip install Mako==1.0.6
# Install pywin32 extensions, needed by SCons
- python -m pip install pypiwin32
# Install python wheels, necessary to install SCons via pip
- python -m pip install wheel
# Install SCons
- python -m pip install --egg scons==2.4.1
- python -m pip install scons==2.5.1
- scons --version
# Install flex/bison
- if not exist "%WINFLEXBISON_ARCHIVE%" appveyor DownloadFile "https://downloads.sourceforge.net/project/winflexbison/old_versions/%WINFLEXBISON_ARCHIVE%"


@@ -1,3 +1,2 @@
[*.sh]
indent_style = space
indent_size = 2
indent_style = tab


@@ -1,4 +1,4 @@
#!/bin/bash
#!/bin/sh
# This script is used to generate the list of fixed bugs that
# appears in the release notes files, with HTML formatting.
@@ -11,8 +11,6 @@
# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3
# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 > bugfixes
# $ bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 | tee bugfixes
# $ DRYRUN=yes bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3
# $ DRYRUN=yes bin/bugzilla_mesa.sh mesa-9.0.2..mesa-9.0.3 | wc -l
# regex pattern: trim before bug number
@@ -21,29 +19,17 @@ trim_before='s/.*show_bug.cgi?id=\([0-9]*\).*/\1/'
# regex pattern: reconstruct the url
use_after='s,^,https://bugs.freedesktop.org/show_bug.cgi?id=,'
echo "<ul>"
echo ""
# extract fdo urls from commit log
urls=$(git log $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before | sort -n -u | sed -e $use_after)
# if DRYRUN is set to "yes", simply print the URLs and don't fetch the
# details from fdo bugzilla.
#DRYRUN=yes
if [ "x$DRYRUN" = xyes ]; then
for i in $urls
do
echo $i
done
else
echo "<ul>"
git log $* | grep 'bugs.freedesktop.org/show_bug' | sed -e $trim_before | sort -n -u | sed -e $use_after |\
while read url
do
id=$(echo $url | cut -d'=' -f2)
summary=$(wget --quiet -O - $url | grep -e '<title>.*</title>' | sed -e 's/ *<title>[0-9]\+ &ndash; \(.*\)<\/title>/\1/')
echo "<li><a href=\"$url\">Bug $id</a> - $summary</li>"
echo ""
done
for i in $urls
do
id=$(echo $i | cut -d'=' -f2)
summary=$(wget --quiet -O - $i | grep -e '<title>.*</title>' | sed -e 's/ *<title>[0-9]\+ &ndash; \(.*\)<\/title>/\1/')
echo "<li><a href=\"$i\">Bug $id</a> - $summary</li>"
echo ""
done
echo "</ul>"
fi
echo "</ul>"
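
The two sed patterns in this script first trim each matching log line down to the bug number and then rebuild a canonical URL from it, with `sort -n -u` removing duplicates in between. A small Python sketch of that pipeline, without the wget step, could be written as below. The function name is hypothetical, and unlike the greedy sed pattern it takes the first bug ID on a line rather than the last.

```python
import re

BUG_URL = "https://bugs.freedesktop.org/show_bug.cgi?id="


def bug_urls(log_text):
    """Return unique bug URLs found in a git log, sorted by bug number."""
    ids = set()
    for line in log_text.splitlines():
        if "bugs.freedesktop.org/show_bug" not in line:
            continue
        match = re.search(r"show_bug\.cgi\?id=([0-9]+)", line)
        if match:
            ids.add(int(match.group(1)))
    return [BUG_URL + str(bug_id) for bug_id in sorted(ids)]
```

Sorting the IDs as integers mirrors `sort -n`, so bug 99 is listed before bug 103388.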


@@ -10,26 +10,36 @@
# $ bin/get-extra-pick-list.sh | tee picklist
# Use the last branchpoint as our limit for the search
# XXX: there should be a better way for this
latest_branchpoint=`git branch | grep \* | cut -c 3-`-branchpoint
latest_branchpoint=`git merge-base origin/master HEAD`
# Grep for commits with "cherry picked from commit" in the commit message.
git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' |\
cut -c -8 |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# For each cherry-picked commit...
cat already_picked | cut -c -8 |\
while read sha
do
# Check if the original commit is referenced in master
# ... check if it's referenced (fixed by another) patch
git log -n1 --pretty=oneline --grep=$sha $latest_branchpoint..origin/master |\
cut -c -8 |\
while read candidate
do
# Check if the potential fix, hasn't landed in branch yet.
found=`git log -n1 --pretty=oneline --reverse --grep=$candidate $latest_branchpoint..HEAD |wc -l`
if test $found = 0
then
echo Commit $candidate might need to be picked, as it references $sha
# And flag up if it hasn't landed in branch yet.
if grep -q ^$candidate already_picked ; then
continue
fi
# Or if it isn't in the ignore list.
if [ -f bin/.cherry-ignore ] ; then
if grep -q ^$candidate bin/.cherry-ignore ; then
continue
fi
fi
printf "Commit \"%s\" references %s\n" \
"`git log -n1 --pretty=oneline $candidate`" \
"$sha"
done
done
rm -f already_picked
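
The script above builds its `already_picked` list by stripping the "(cherry picked from commit ...)" wrapper from commit messages and then shortening each hash with `cut -c -8`. As a minimal sketch of that extraction step only, assuming lowercase hexadecimal hashes, the same result can be had with one regular expression. The function name is illustrative.

```python
import re

PICKED = re.compile(
    r"^[ \t]*\(cherry picked from commit[ \t]*([0-9a-f]+)\)",
    re.MULTILINE,
)


def picked_shas(log_text, length=8):
    """Return the abbreviated hashes of commits that were cherry-picked."""
    return [match.group(1)[:length] for match in PICKED.finditer(log_text)]
```

The abbreviated hashes are what the script then feeds to `git log --grep` to look for follow-up commits on master.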

bin/get-fixes-pick-list.sh Executable file

@@ -0,0 +1,81 @@
#!/bin/sh
# Script for generating a list of candidates [referenced by a Fixes tag] for
# cherry-picking to a stable branch
#
# Usage examples:
#
# $ bin/get-fixes-pick-list.sh
# $ bin/get-fixes-pick-list.sh > picklist
# $ bin/get-fixes-pick-list.sh | tee picklist
# Use the last branchpoint as our limit for the search
latest_branchpoint=`git merge-base origin/master HEAD`
# List all the commits between day 1 and the branch point...
git log --reverse --pretty=%H $latest_branchpoint > already_landed
# ... and the ones cherry-picked.
git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits with Fixes tag
git log --reverse --pretty=%H -i --grep="fixes:" $latest_branchpoint..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list ...
if [ -f bin/.cherry-ignore ] ; then
if grep -q ^$sha bin/.cherry-ignore ; then
continue
fi
fi
# Skip if it has been already cherry-picked.
if grep -q ^$sha already_picked ; then
continue
fi
# Place every "fixes:" tag on its own line and join with the next word
# on its line or a later one.
fixes=`git show -s $sha | tr -d "\n" | sed -e 's/fixes:[[:space:]]*/\nfixes:/Ig' | grep "fixes:" | sed -e 's/\(fixes:[a-zA-Z0-9]*\).*$/\1/'`
# For each one try to extract the tag
fixes_count=`echo "$fixes" | wc -l`
warn=`(test $fixes_count -gt 1 && echo $fixes_count) || echo 0`
while [ $fixes_count -gt 0 ] ; do
# Treat only the current line
id=`echo "$fixes" | tail -n $fixes_count | head -n 1 | cut -d : -f 2`
fixes_count=$(($fixes_count-1))
# Bail out if we cannot find suitable id.
# Any specific validation the $id is valid and not some junk, is
# implied with the follow up code
if [ "x$id" = x ] ; then
continue
fi
# Check if the offending commit is in branch.
# Be that cherry-picked ...
# ... or landed before the branchpoint.
if grep -q ^$id already_picked ||
grep -q ^$id already_landed ; then
printf "Commit \"%s\" fixes %s\n" \
"`git log -n1 --pretty=oneline $sha`" \
"$id"
warn=$(($warn-1))
fi
done
if [ $warn -gt 0 ] ; then
printf "WARNING: Commit \"%s\" has more than one Fixes tag\n" \
"`git log -n1 --pretty=oneline $sha`"
fi
done
rm -f already_picked
rm -f already_landed
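
The core of the script above is pulling the commit ID out of each "Fixes:" tag and checking whether that commit is already in the branch, either cherry-picked or landed before the branchpoint. A hedged Python sketch of those two steps follows. The function names are hypothetical, and the full-length hash in the usage note is padded with zeros purely for illustration.

```python
import re

FIXES = re.compile(r"fixes:\s*([a-zA-Z0-9]*)", re.IGNORECASE)


def fixes_ids(message):
    """Return the commit IDs named by Fixes: tags in a commit message."""
    return [sha for sha in FIXES.findall(message) if sha]


def is_candidate(message, in_branch):
    """True if any Fixes: tag names a commit whose hash is in `in_branch`.

    `in_branch` is a list of full hashes; tags are matched by prefix,
    like `grep -q ^$id` in the shell script.
    """
    return any(
        full.startswith(sha)
        for sha in fixes_ids(message)
        for full in in_branch
    )
```

For example, a message containing `Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'` yields `['e38685cc62']`, and the commit is a candidate only if a hash starting with that prefix is in the branch.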


@@ -8,13 +8,16 @@
# $ bin/get-pick-list.sh > picklist
# $ bin/get-pick-list.sh | tee picklist
# Use the last branchpoint as our limit for the search
latest_branchpoint=`git merge-base origin/master HEAD`
# Grep for commits with "cherry picked from commit" in the commit message.
git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
git log --reverse --pretty=%H -i --grep='^\([[:space:]]*NOTE: .*[Cc]andidate\|CC:.*mesa-stable\)' HEAD..origin/master |\
git log --reverse --pretty=%H -i --grep='^CC:.*mesa-stable' $latest_branchpoint..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.


@@ -12,13 +12,16 @@
# This script intentionally _never_ checks for specific version tag
# Should we consider folding it with the original get-pick-list.sh
# Use the last branchpoint as our limit for the search
latest_branchpoint=`git merge-base origin/master HEAD`
# Grep for commits with "cherry picked from commit" in the commit message.
git log --reverse --grep="cherry picked from commit" origin/master..HEAD |\
git log --reverse --grep="cherry picked from commit" $latest_branchpoint..HEAD |\
grep "cherry picked from commit" |\
sed -e 's/^[[:space:]]*(cherry picked from commit[[:space:]]*//' -e 's/)//' > already_picked
# Grep for commits that were marked as a candidate for the stable tree.
git log --reverse --pretty=%H -i --grep='^CC:.*mesa-dev' HEAD..origin/master |\
git log --reverse --pretty=%H -i --grep='^CC:.*mesa-dev' $latest_branchpoint..origin/master |\
while read sha
do
# Check to see whether the patch is on the ignore list.

bin/git_sha1_gen.py Executable file

@@ -0,0 +1,36 @@
#!/usr/bin/env python
"""
Generate the contents of the git_sha1.h file.
The output of this script goes to stdout.
"""
import os
import os.path
import subprocess
import sys

def get_git_sha1():
    """Try to get the git SHA1 with git rev-parse."""
    git_dir = os.path.join(os.path.dirname(sys.argv[0]), '..', '.git')
    try:
        git_sha1 = subprocess.check_output([
            'git',
            '--git-dir=' + git_dir,
            'rev-parse',
            'HEAD',
        ], stderr=open(os.devnull, 'w')).decode("ascii")
    except:
        # don't print anything if it fails
        git_sha1 = ''
    return git_sha1


git_sha1 = os.environ.get('MESA_GIT_SHA1_OVERRIDE', get_git_sha1())[:10]
if git_sha1:
    git_sha1_h_in_path = os.path.join(os.path.dirname(sys.argv[0]),
            '..', 'src', 'git_sha1.h.in')
    with open(git_sha1_h_in_path , 'r') as git_sha1_h_in:
        sys.stdout.write(git_sha1_h_in.read().replace('@VCS_TAG@', git_sha1))
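
The script's last few lines do two things: pick the SHA1 (the `MESA_GIT_SHA1_OVERRIDE` environment variable wins over the value from git, and the result is cut to 10 characters) and substitute it for `@VCS_TAG@` in a template. A minimal sketch of that logic as pure functions is shown below. The function names and the template text in the usage note are assumptions, since the contents of `git_sha1.h.in` are not part of this diff.

```python
def resolve_sha1(environ, fallback, length=10):
    """Prefer the override from `environ`, else `fallback`; then truncate."""
    return environ.get('MESA_GIT_SHA1_OVERRIDE', fallback)[:length]


def render_header(template, sha1):
    """Replace the @VCS_TAG@ placeholder with the abbreviated SHA1."""
    return template.replace('@VCS_TAG@', sha1)
```

For example, `render_header('#define SHA1 "git-@VCS_TAG@"', resolve_sha1({}, '2665d012a8ffff'))` gives `'#define SHA1 "git-2665d012a8"'`. An empty result from `resolve_sha1` corresponds to the case where the script prints nothing.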

bin/install_megadrivers.py Executable file

@@ -0,0 +1,55 @@
#!/usr/bin/env python
# encoding=utf-8
# Copyright © 2017 Intel Corporation
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
"""Script to install megadriver symlinks for meson."""
from __future__ import print_function
import argparse
import os
import shutil

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('megadriver')
    parser.add_argument('libdir')
    parser.add_argument('drivers', nargs='+')
    args = parser.parse_args()

    to = os.path.join(os.environ.get('MESON_INSTALL_DESTDIR_PREFIX'), args.libdir)
    master = os.path.join(to, os.path.basename(args.megadriver))

    if not os.path.exists(to):
        os.makedirs(to)
    shutil.copy(args.megadriver, master)

    for each in args.drivers:
        driver = os.path.join(to, each)
        if os.path.exists(driver):
            os.unlink(driver)
        print('installing {} to {}'.format(args.megadriver, to))
        os.link(master, driver)
    os.unlink(master)


if __name__ == '__main__':
    main()
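
The installer above copies the megadriver once, creates one hard link per driver name, and then removes the temporary copy, so all driver files share a single inode. The same pattern can be sketched as a standalone function, here without argument parsing or the `MESON_INSTALL_DESTDIR_PREFIX` lookup. The function name and the file names in the usage note are hypothetical.

```python
import os
import shutil
import tempfile


def install_hardlinks(source, dest_dir, names):
    """Copy `source` into `dest_dir` once and hard-link it under each name."""
    if not os.path.exists(dest_dir):
        os.makedirs(dest_dir)
    master = os.path.join(dest_dir, os.path.basename(source))
    shutil.copy(source, master)
    targets = []
    for name in names:
        target = os.path.join(dest_dir, name)
        if os.path.exists(target):
            os.unlink(target)
        os.link(master, target)  # new directory entry, same file data
        targets.append(target)
    # The data stays alive through the remaining links.
    os.unlink(master)
    return targets
```

Calling `install_hardlinks('libmega.so', 'lib/dri', ['a_dri.so', 'b_dri.so'])` would leave two names for the same file and no `libmega.so` in the destination. This relies on the filesystem supporting hard links.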


@@ -133,7 +133,7 @@ class PerfParser(LineParser):
def __init__(self, infile, symbol):
LineParser.__init__(self, infile)
self.symbol = symbol
self.symbol = symbol
def readline(self):
# Override LineParser.readline to ignore comment lines
@@ -155,7 +155,7 @@ class PerfParser(LineParser):
addresses.sort()
total_samples = 0
sys.stdout.write('%s:\n' % self.symbol)
sys.stdout.write('%s:\n' % self.symbol)
for address, instr in asm:
try:
sample = samples.pop(address)


@@ -1,4 +1,4 @@
#!/bin/bash
#!/bin/sh
# This script is used to generate the list of changes that
# appears in the release notes files, with HTML formatting.
@@ -10,7 +10,7 @@
# $ bin/shortlog_mesa.sh mesa-9.0.2..mesa-9.0.3 | tee changes
typeset -i in_log=0
in_log=0
git shortlog $* | while read l
do


@@ -0,0 +1,3 @@
{
radeon_drm_winsys_create;
};


@@ -0,0 +1,6 @@
VERSION_1 {
global:
main;
local:
*;
};

File diff suppressed because it is too large


@@ -39,7 +39,7 @@ steps that work as of this writing.
get pywin32-218.4.win-amd64-py2.7.exe
- install git
- download mesa from git
see http://www.mesa3d.org/repository.html
see https://www.mesa3d.org/repository.html
- run scons
General


@@ -33,7 +33,7 @@ without a depth buffer.
<p>
Mesa 9.1.2 and later (will) support a DRI configuration option to work around
this issue.
Using the <a href="http://dri.freedesktop.org/wiki/DriConf">driconf</a> tool,
Using the <a href="https://dri.freedesktop.org/wiki/DriConf">driconf</a> tool,
set the "Create all visuals with a depth buffer" option before running Topogun.
Then, all GLX visuals will be created with a depth buffer.
</p>


@@ -118,7 +118,7 @@ directories. For example, <code>LDFLAGS="-L/usr/X11R6/lib"</code>.</p>
<dt><code>PKG_CONFIG_PATH</code></dt>
<dd><p>The
<code>pkg-config</code> utility is a hard requirement for cofiguring and
<code>pkg-config</code> utility is a hard requirement for configuring and
building mesa. It is used to search for external libraries
on the system. This environment variable is used to control the search
path for <code>pkg-config</code>. For instance, setting
@@ -137,7 +137,7 @@ There are also a few general options for altering the Mesa build:
hasn't already set them via the CFLAGS/CXXFLAGS) and macros to aid in
debugging the Mesa libraries.</p>
<p>Note that enabling this option can lead to noticable loss of performance.</p>
<p>Note that enabling this option can lead to noticeable loss of performance.</p>
<dt><code>--disable-asm</code></dt>
<dd><p>There are assembly routines


@@ -18,7 +18,7 @@
<p>
The Mesa bug database is hosted on
<a href="http://freedesktop.org">freedesktop.org</a>.
<a href="https://freedesktop.org">freedesktop.org</a>.
The old bug database on SourceForge is no longer used.
</p>
@@ -37,11 +37,14 @@ Please follow these bug reporting guidelines:
the problem.
<li>Check if your bug is already reported in the database.
<li>Monitor your bug report for requests for additional information, etc.
<li>Attach the output of running glxinfo or wglinfo.
This will tell us the Mesa version, which device driver you're using, etc.
<li>If you're reporting a crash, try to use your debugger (gdb) to get a stack
trace. Also, recompile Mesa in debug mode to get more detailed information.
<li>Describe in detail how to reproduce the bug, especially with games
and applications that the Mesa developers might not be familiar with.
<li>Provide a simple GLUT-based test program if possible
<li>Provide an <a href="https://github.com/apitrace/apitrace">apitrace</a>
or simple GLUT-based test program if possible.
</ul>
<p>


@@ -58,7 +58,7 @@ and not <tt>a=b+c;</tt>
<li>Use comments wherever you think it would be helpful for other developers.
Several specific cases and style examples follow. Note that we roughly
follow <a href="http://www.stack.nl/~dimitri/doxygen/">Doxygen</a> conventions.
follow <a href="https://www.stack.nl/~dimitri/doxygen/">Doxygen</a> conventions.
<br>
<br>
Single-line comments:
@@ -120,7 +120,7 @@ the opening brace goes on the next line by itself (see above.)
_mesa_foo_bar() - an internal non-static Mesa function
</pre>
<li>Constants, macros and enumerant names are ALL_UPPERCASE, with _ between
<li>Constants, macros and enum names are ALL_UPPERCASE, with _ between
words.
<li>Mesa usually uses camel case for local variables (Ex: "localVarname")
while gallium typically uses underscores (Ex: "local_var_name").

@@ -53,7 +53,7 @@
<li><a href="lists.html" target="_parent">Mailing Lists</a>
<li><a href="bugs.html" target="_parent">Bug Database</a>
<li><a href="webmaster.html" target="_parent">Webmaster</a>
<li><a href="http://dri.freedesktop.org/" target="_parent">Mesa/DRI Wiki</a>
<li><a href="https://dri.freedesktop.org/" target="_parent">Mesa/DRI Wiki</a>
</ul>
<b>User Topics</b>
@@ -83,23 +83,24 @@
<li><a href="devinfo.html" target="_parent">Development Notes</a>
<li><a href="codingstyle.html" target="_parent">Coding Style</a>
<li><a href="submittingpatches.html" target="_parent">Submitting patches</a>
<li><a href="releasing.html" target="_parent">Releasing process</a>
<li><a href="release-calendar.html" target="_parent">Release calendar</a>
<li><a href="sourcedocs.html" target="_parent">Source Documentation</a>
<li><a href="dispatch.html" target="_parent">GL Dispatch</a>
</ul>
<b>Links</b>
<ul>
<li><a href="http://www.opengl.org" target="_parent">OpenGL website</a>
<li><a href="http://dri.freedesktop.org" target="_parent">DRI website</a>
<li><a href="http://www.freedesktop.org" target="_parent">freedesktop.org</a>
<li><a href="http://planet.freedesktop.org" target="_parent">Developer blogs</a>
<li><a href="https://www.opengl.org" target="_parent">OpenGL website</a>
<li><a href="https://dri.freedesktop.org" target="_parent">DRI website</a>
<li><a href="https://www.freedesktop.org" target="_parent">freedesktop.org</a>
<li><a href="https://planet.freedesktop.org" target="_parent">Developer blogs</a>
</ul>
<b>Hosted by:</b>
<br>
<blockquote>
<a href="http://sourceforge.net"
target="_parent">sourceforge.net</a>
<a href="https://freedesktop.org" target="_parent">freedesktop.org</a>
</blockquote>
</body>

@@ -20,7 +20,7 @@
Both professional and volunteer developers contribute to Mesa.
</p>
<p>
<a href="http://www.vmware.com/">VMware</a>
<a href="https://www.vmware.com/">VMware</a>
employs several of the main Mesa developers including Brian Paul
and Keith Whitwell.
</p>
@@ -44,7 +44,7 @@ Intel has recently contributed the new GLSL compiler in Mesa 7.9.
</p>
<p>
<a href="http://www.lunarg.com/">LunarG</a> can be contacted
<a href="https://www.lunarg.com/">LunarG</a> can be contacted
for custom Mesa / 3D graphics development.
</p>

@@ -20,47 +20,40 @@
Primary Mesa download site:
<a href="ftp://ftp.freedesktop.org/pub/mesa/">ftp.freedesktop.org</a> (FTP)
or <a href="https://mesa.freedesktop.org/archive/">mesa.freedesktop.org</a>
(HTTP).
(HTTPS).
</p>
<p>
Starting with the first release of 2017, Mesa's version scheme is
year-based. Filenames are in the form <tt>mesa-Y.N.P.tar.gz</tt>, where
<tt>Y</tt> is the year (two digits), <tt>N</tt> is an incremental number
(starting at 0) and <tt>P</tt> is the patch number (0 for the first
release, 1 for the first patch after that).
</p>
<p>
When a new release is coming, release candidates (betas) may be found
<a href="ftp://ftp.freedesktop.org/pub/mesa/beta/">here</a>.
in the same directory, and are recognisable by the
<tt>mesa-Y.N.P-<b>rc</b>X.tar.gz</tt> filename.
</p>
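<p>
The year-based naming scheme described above can be checked mechanically.
The following is only a minimal sketch, not part of Mesa: the helper name
<tt>parse_mesa_tarball</tt> and the regular expression are assumptions made
for illustration, and the pattern accepts only the <tt>.tar.gz</tt> and
<tt>.tar.xz</tt> suffixes mentioned on this page.
</p>

```python
import re

# Hypothetical pattern for the scheme described above:
#   mesa-Y.N.P.tar.gz, mesa-Y.N.P.tar.xz, or mesa-Y.N.P-rcX.tar.* for candidates.
_TARBALL_RE = re.compile(
    r"^mesa-(?P<year>\d{2})\.(?P<number>\d+)\.(?P<patch>\d+)"
    r"(?:-rc(?P<rc>\d+))?\.tar\.(?P<ext>gz|xz)$"
)


def parse_mesa_tarball(filename):
    """Return the version fields of a release tarball name, or None.

    'rc' is None for a final release and an integer for a release candidate.
    """
    match = _TARBALL_RE.match(filename)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "year": int(fields["year"]),      # Y: two-digit year
        "number": int(fields["number"]),  # N: incremental number, starts at 0
        "patch": int(fields["patch"]),    # P: patch number, 0 for first release
        "rc": int(fields["rc"]) if fields["rc"] is not None else None,
        "ext": fields["ext"],
    }
```

<p>
Names that follow the older <tt>MesaLib-x.y.z</tt> pattern do not match and
yield <tt>None</tt>, which makes it easy to tell the two schemes apart.
</p>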
<h1>Unpacking</h1>
<p>
Mesa releases are available in three formats: .tar.bz2, .tar.gz, and .zip
Mesa releases are available in two formats: <tt>.tar.xz</tt> and <tt>.tar.gz</tt>.
</p>
<p>
To unpack .tar.gz files:
</p>
To unpack the tarball:
<pre>
tar zxf MesaLib-x.y.z.tar.gz
tar xf mesa-Y.N.P.tar.xz
</pre>
or
<pre>
gzcat MesaLib-x.y.z.tar.gz | tar xf -
tar xf mesa-Y.N.P.tar.gz
</pre>
or
<pre>
gunzip MesaLib-x.y.z.tar.gz ; tar xf MesaLib-x.y.z.tar
</pre>
<p>
To unpack .tar.bz2 files:
</p>
<pre>
bunzip2 -c MesaLib-x.y.z.tar.gz | tar xf -
</pre>
<p>
To unpack .zip files:
</p>
<pre>
unzip MesaLib-x.y.z.zip
</pre>
<h1>Contents</h1>
@@ -69,8 +62,8 @@ To unpack .zip files:
After unpacking you'll have these files and directories (among others):
</p>
<pre>
Makefile - top-level Makefile for most systems
configs/ - makefile parameter files for various systems
autogen.sh - Autoconf script for *nix systems
scons/ - SCons script for Windows builds
include/ - GL header (include) files
bin/ - shell scripts for making shared libraries, etc
docs/ - documentation
@@ -109,9 +102,9 @@ In the past, GLUT, GLU and the Mesa demos were released in conjunction with
Mesa releases. But since GLUT, GLU and the demos change infrequently, they
were split off into their own git repositories:
<a href="http://cgit.freedesktop.org/mesa/glut/">GLUT</a>,
<a href="http://cgit.freedesktop.org/mesa/glu/">GLU</a> and
<a href="http://cgit.freedesktop.org/mesa/demos/">Demos</a>,
<a href="https://cgit.freedesktop.org/mesa/glut/">GLUT</a>,
<a href="https://cgit.freedesktop.org/mesa/glu/">GLU</a> and
<a href="https://cgit.freedesktop.org/mesa/demos/">Demos</a>,
</p>
</div>

@@ -18,8 +18,8 @@
<p>The current version of EGL in Mesa implements EGL 1.4. More information
about EGL can be found at
<a href="http://www.khronos.org/egl/">
http://www.khronos.org/egl/</a>.</p>
<a href="https://www.khronos.org/egl/">
https://www.khronos.org/egl/</a>.</p>
<p>The Mesa's implementation of EGL uses a driver architecture. The main
library (<code>libEGL</code>) is window system neutral. It provides the EGL
@@ -44,7 +44,7 @@ the driver for your hardware. For example</p>
<p>The main library and OpenGL is enabled by default. The first two options
above enables <a href="opengles.html">OpenGL ES 1.x and 2.x</a>. The last two
options enables the listed classic and and Gallium drivers respectively.</p>
options enables the listed classic and Gallium drivers respectively.</p>
</li>
@@ -77,15 +77,13 @@ drivers will be installed to <code>${libdir}/egl</code>.</p>
</dd>
<dt><code>--with-egl-platforms</code></dt>
<dt><code>--with-platforms</code></dt>
<dd>
<p>List the platforms (window systems) to support. Its argument is a comma
separated string such as <code>--with-egl-platforms=x11,drm</code>. It decides
separated string such as <code>--with-platforms=x11,drm</code>. It decides
the platforms a driver may support. The first listed platform is also used by
the main library to decide the native platform: the platform the EGL native
types such as <code>EGLNativeDisplayType</code> or
<code>EGLNativeWindowType</code> defined for.</p>
the main library to decide the native platform.</p>
<p>The available platforms are <code>x11</code>, <code>drm</code>,
<code>wayland</code>, <code>surfaceless</code>, <code>android</code>,
@@ -132,44 +130,13 @@ mesa/demos repository.</p>
runtime</p>
<dl>
<dt><code>EGL_DRIVERS_PATH</code></dt>
<dd>
<p>By default, the main library will look for drivers in the directory where
the drivers are installed to. This variable specifies a list of
colon-separated directories where the main library will look for drivers, in
addition to the default directory. This variable is ignored for setuid/setgid
binaries.</p>
<p>This variable is usually set to test an uninstalled build. For example, one
may set</p>
<pre>
$ export LD_LIBRARY_PATH=$mesa/lib
$ export EGL_DRIVERS_PATH=$mesa/lib/egl
</pre>
<p>to test a build without installation</p>
</dd>
<dt><code>EGL_DRIVER</code></dt>
<dd>
<p>This variable specifies a full path to or the name of an EGL driver. It
forces the specified EGL driver to be loaded. It comes in handy when one wants
to test a specific driver. This variable is ignored for setuid/setgid
binaries.</p>
</dd>
<dt><code>EGL_PLATFORM</code></dt>
<dd>
<p>This variable specifies the native platform. The valid values are the same
as those for <code>--with-egl-platforms</code>. When the variable is not set,
as those for <code>--with-platforms</code>. When the variable is not set,
the main library uses the first platform listed in
<code>--with-egl-platforms</code> as the native platform.</p>
<code>--with-platforms</code> as the native platform.</p>
<p>Extensions like <code>EGL_MESA_drm_display</code> define new functions to
create displays for non-native platforms. These extensions are usually used by

@@ -29,12 +29,12 @@ sometimes be useful for debugging end-user issues.
<li>LIBGL_DEBUG - If defined debug information will be printed to stderr.
If set to 'verbose' additional information will be printed.
<li>LIBGL_DRIVERS_PATH - colon-separated list of paths to search for DRI drivers
<li>LIBGL_ALWAYS_INDIRECT - forces an indirect rendering context/connection.
<li>LIBGL_ALWAYS_SOFTWARE - if set, always use software rendering
<li>LIBGL_NO_DRAWARRAYS - if set do not use DrawArrays GLX protocol (for debugging)
<li>LIBGL_ALWAYS_INDIRECT - if set to `true`, forces an indirect rendering context/connection.
<li>LIBGL_ALWAYS_SOFTWARE - if set to `true`, always use software rendering
<li>LIBGL_NO_DRAWARRAYS - if set to `true`, do not use DrawArrays GLX protocol (for debugging)
<li>LIBGL_SHOW_FPS - print framerate to stdout based on the number of glXSwapBuffers
calls per second.
<li>LIBGL_DRI3_DISABLE - disable DRI3 if set (the value does not matter)
<li>LIBGL_DRI3_DISABLE - disable DRI3 if set to `true`.
</ul>
@@ -46,6 +46,9 @@ sometimes be useful for debugging end-user issues.
<li>MESA_NO_MMX - if set, disables Intel MMX optimizations
<li>MESA_NO_3DNOW - if set, disables AMD 3DNow! optimizations
<li>MESA_NO_SSE - if set, disables Intel SSE optimizations
<li>MESA_NO_ERROR - if set to 1, error checking is disabled as per KHR_no_error.
This will result in undefined behaviour for invalid use of the api, but
can reduce CPU use for apps that are known to be error free.</li>
<li>MESA_DEBUG - if set, error messages are printed to stderr. For example,
if the application generates a GL_INVALID_ENUM error, a corresponding error
message indicating where the error occurred, and possibly why, will be
@@ -114,8 +117,24 @@ glGetString(GL_VERSION) for OpenGL ES.
glGetString(GL_SHADING_LANGUAGE_VERSION). Valid values are integers, such as
"130". Mesa will not really implement all the features of the given language version
if it's higher than what's normally reported. (for developers only)
<li>MESA_GLSL_CACHE_DISABLE - if set to `true`, disables the GLSL shader cache
<li>MESA_GLSL_CACHE_MAX_SIZE - if set, determines the maximum size of
the on-disk cache of compiled GLSL programs. Should be set to a number
optionally followed by 'K', 'M', or 'G' to specify a size in
kilobytes, megabytes, or gigabytes. By default, gigabytes will be
assumed. And if unset, a maximum size of 1GB will be used. Note: A separate
cache might be created for each architecture that Mesa is installed for on
your system. For example under the default settings you may end up with a 1GB
cache for x86_64 and another 1GB cache for i386.
<li>MESA_GLSL_CACHE_DIR - if set, determines the directory to be used
for the on-disk cache of compiled GLSL programs. If this variable is
not set, then the cache will be stored in $XDG_CACHE_HOME/mesa (if
that variable is set), or else within .cache/mesa within the user's
home directory.
<li>MESA_GLSL - <a href="shading.html#envvars">shading language compiler options</a>
<li>MESA_NO_MINMAX_CACHE - when set, the minmax index cache is globally disabled.
<li>MESA_SHADER_CAPTURE_PATH - see <a href="shading.html#capture">Capturing Shaders</a></li>
<li>MESA_SHADER_DUMP_PATH and MESA_SHADER_READ_PATH - see <a href="shading.html#replacement">Experimenting with Shader Replacements</a></li>
</ul>
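<p>
The rules given above for MESA_GLSL_CACHE_MAX_SIZE and MESA_GLSL_CACHE_DIR
can be sketched as follows. This is an illustration of the documented
behaviour, not Mesa's actual code: the function names are hypothetical, and
the use of 1024-based units and case-insensitive suffixes are assumptions
that the text above does not state.
</p>

```python
import os

# Assumption: K/M/G are treated as powers of 1024; the docs do not say.
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def cache_max_size_bytes(env=None):
    """Interpret MESA_GLSL_CACHE_MAX_SIZE as described in the list above."""
    env = os.environ if env is None else env
    value = (env.get("MESA_GLSL_CACHE_MAX_SIZE") or "").strip()
    if not value:
        return _UNITS["G"]  # unset: a maximum size of 1GB is used
    suffix = value[-1].upper()
    if suffix in _UNITS:
        return int(value[:-1]) * _UNITS[suffix]
    return int(value) * _UNITS["G"]  # bare number: gigabytes are assumed


def cache_dir(env=None):
    """Pick the cache directory in the documented order of preference."""
    env = os.environ if env is None else env
    explicit = env.get("MESA_GLSL_CACHE_DIR")
    if explicit:
        return explicit
    xdg = env.get("XDG_CACHE_HOME")
    if xdg:
        return os.path.join(xdg, "mesa")
    # Fallback: .cache/mesa within the user's home directory.
    return os.path.join(os.path.expanduser("~"), ".cache", "mesa")
```

<p>
Both helpers accept an optional dictionary in place of the process
environment, so the lookup order can be exercised without changing any real
settings.
</p>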
@@ -146,47 +165,49 @@ See the <a href="xlibdriver.html">Xlib software driver page</a> for details.
This is useful for debugging hangs, etc.</li>
<li>INTEL_DEBUG - a comma-separated list of named flags, which do various things:
<ul>
<li>tex - emit messages about textures.</li>
<li>state - emit messages about state flag tracking</li>
<li>blit - emit messages about blit operations</li>
<li>miptree - emit messages about miptrees</li>
<li>perf - emit messages about performance issues</li>
<li>perfmon - emit messages about AMD_performance_monitor</li>
<li>ann - annotate IR in assembly dumps</li>
<li>aub - dump batches into an AUB trace for use with simulation tools</li>
<li>bat - emit batch information</li>
<li>pix - emit messages about pixel operations</li>
<li>blit - emit messages about blit operations</li>
<li>blorp - emit messages about the blorp operations (blits &amp; clears)</li>
<li>buf - emit messages about buffer objects</li>
<li>clip - emit messages about the clip unit (for old gens, includes the CLIP program)</li>
<li>color - use color in output</li>
<li>cs - dump shader assembly for compute shaders</li>
<li>do32 - generate compute shader SIMD32 programs even if workgroup size doesn't exceed the SIMD16 limit</li>
<li>dri - emit messages about the DRI interface</li>
<li>fbo - emit messages about framebuffers</li>
<li>fs - dump shader assembly for fragment shaders</li>
<li>gs - dump shader assembly for geometry shaders</li>
<li>sync - after sending each batch, emit a message and wait for that batch to finish rendering</li>
<li>prim - emit messages about drawing primitives</li>
<li>vert - emit messages about vertex assembly</li>
<li>dri - emit messages about the DRI interface</li>
<li>sf - emit messages about the strips &amp; fans unit (for old gens, includes the SF program)</li>
<li>stats - enable statistics counters. you probably actually want perfmon or intel_gpu_top instead.</li>
<li>urb - emit messages about URB setup</li>
<li>vs - dump shader assembly for vertex shaders</li>
<li>clip - emit messages about the clip unit (for old gens, includes the CLIP program)</li>
<li>aub - dump batches into an AUB trace for use with simulation tools</li>
<li>shader_time - record how much GPU time is spent in each shader</li>
<li>no16 - suppress generation of 16-wide fragment shaders. useful for debugging broken shaders</li>
<li>blorp - emit messages about the blorp operations (blits &amp; clears)</li>
<li>nodualobj - suppress generation of dual-object geometry shader code</li>
<li>optimizer - dump shader assembly to files at each optimization pass and iteration that make progress</li>
<li>ann - annotate IR in assembly dumps</li>
<li>hex - print instruction hex dump with the disassembly</li>
<li>l3 - emit messages about the new L3 state during transitions</li>
<li>miptree - emit messages about miptrees</li>
<li>no8 - don't generate SIMD8 fragment shader</li>
<li>vec4 - force vec4 mode in vertex shader</li>
<li>no16 - suppress generation of 16-wide fragment shaders. useful for debugging broken shaders</li>
<li>nocompact - disable instruction compaction</li>
<li>nodualobj - suppress generation of dual-object geometry shader code</li>
<li>norbc - disable single sampled render buffer compression</li>
<li>optimizer - dump shader assembly to files at each optimization pass and iteration that make progress</li>
<li>perf - emit messages about performance issues</li>
<li>perfmon - emit messages about AMD_performance_monitor</li>
<li>pix - emit messages about pixel operations</li>
<li>prim - emit messages about drawing primitives</li>
<li>reemit - mark all state dirty on each draw call</li>
<li>sf - emit messages about the strips &amp; fans unit (for old gens, includes the SF program)</li>
<li>shader_time - record how much GPU time is spent in each shader</li>
<li>spill_fs - force spilling of all registers in the scalar backend (useful to debug spilling code)</li>
<li>spill_vec4 - force spilling of all registers in the vec4 backend (useful to debug spilling code)</li>
<li>cs - dump shader assembly for compute shaders</li>
<li>hex - print instruction hex dump with the disassembly</li>
<li>nocompact - disable instruction compaction</li>
<li>state - emit messages about state flag tracking</li>
<li>submit - emit batchbuffer usage statistics</li>
<li>sync - after sending each batch, emit a message and wait for that batch to finish rendering</li>
<li>tcs - dump shader assembly for tessellation control shaders</li>
<li>tes - dump shader assembly for tessellation evaluation shaders</li>
<li>l3 - emit messages about the new L3 state during transitions</li>
<li>do32 - generate compute shader SIMD32 programs even if workgroup size doesn't exceed the SIMD16 limit</li>
<li>norbc - disable single sampled render buffer compression</li>
<li>tex - emit messages about textures.</li>
<li>urb - emit messages about URB setup</li>
<li>vert - emit messages about vertex assembly</li>
<li>vs - dump shader assembly for vertex shaders</li>
</ul>
<li>INTEL_SCALAR_VS (or TCS, TES, GS) - force scalar/vec4 mode for a shader stage (Gen8-9 only)</li>
<li>INTEL_PRECISE_TRIG - if set to 1, true or yes, then the driver prefers
accuracy over performance in trig functions.</li>
</ul>
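<p>
Since INTEL_DEBUG is a comma-separated list of named flags, a script that
wraps an application can inspect it before launching. The sketch below is
illustrative only: <tt>intel_debug_flags</tt> is a hypothetical helper, and
stripping whitespace around names is an assumption, as the text above does
not say whether the driver tolerates it.
</p>

```python
import os


def intel_debug_flags(env=None):
    """Split INTEL_DEBUG into a set of flag names, ignoring empty entries."""
    env = os.environ if env is None else env
    raw = env.get("INTEL_DEBUG", "")
    # Assumption: surrounding whitespace is not significant.
    return {name.strip() for name in raw.split(",") if name.strip()}


if __name__ == "__main__":
    flags = intel_debug_flags()
    if "sync" in flags:
        # Per the list above, 'sync' waits for each batch to finish rendering.
        print("INTEL_DEBUG=sync is set; expect slower rendering")
```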
@@ -223,7 +244,7 @@ Mesa EGL supports different sets of environment variables. See the
Use kill -10 <pid> to toggle the hud as desired.
<li>GALLIUM_HUD_DUMP_DIR - specifies a directory for writing the displayed
hud values into files.
<li>GALLIUM_DRIVER - useful in combination with LIBGL_ALWAYS_SOFTWARE=1 for
<li>GALLIUM_DRIVER - useful in combination with LIBGL_ALWAYS_SOFTWARE=true for
choosing one of the software renderers "softpipe", "llvmpipe" or "swr".
<li>GALLIUM_LOG_FILE - specifies a file for logging all errors, warnings, etc.
rather than stderr.
@@ -287,6 +308,8 @@ See src/mesa/state_tracker/st_debug.c for other options.
(will often result in incorrect rendering).
<li>SVGA_DEBUG - for dumping shaders, constant buffers, etc. See the code
for details.
<li>SVGA_EXTRA_LOGGING - if set, enables extra logging to the vmware.log file,
such as the OpenGL program's name and command line arguments.
<li>See the driver code for other, lesser-used variables.
</ul>

@@ -41,7 +41,7 @@ Last updated: 9 October 2012
<p>
Mesa is an open-source implementation of the OpenGL specification.
OpenGL is a programming library for writing interactive 3D applications.
See the <a href="http://www.opengl.org/">OpenGL website</a> for more
See the <a href="https://www.opengl.org/">OpenGL website</a> for more
information.
</p>
<p>
@@ -55,13 +55,13 @@ Yes. Specifically, Mesa serves as the OpenGL core for the open-source DRI
drivers for X.org.
</p>
<ul>
<li>See the <a href="http://dri.freedesktop.org/">DRI website</a>
<li>See the <a href="https://dri.freedesktop.org/">DRI website</a>
for more information.</li>
<li>See <a href="https://01.org/linuxgraphics">01.org</a>
for more information about Intel drivers.</li>
<li>See <a href="http://nouveau.freedesktop.org">nouveau.freedesktop.org</a>
<li>See <a href="https://nouveau.freedesktop.org">nouveau.freedesktop.org</a>
for more information about Nouveau drivers.</li>
<li>See <a href="http://www.x.org/wiki/RadeonFeature">www.x.org/wiki/RadeonFeature</a>
<li>See <a href="https://www.x.org/wiki/RadeonFeature">www.x.org/wiki/RadeonFeature</a>
for more information about Radeon drivers.</li>
</ul>
@@ -144,7 +144,7 @@ Mesa is much more up to date with modern features and extensions.
</p>
<p>
<a href="http://sourceforge.net/projects/ogl-es/">Vincent</a> is
<a href="https://sourceforge.net/projects/ogl-es/">Vincent</a> is
an open-source implementation of OpenGL ES for mobile devices.
<p>
@@ -157,7 +157,7 @@ is a subset of OpenGL.
</p>
<p>
<a href="http://sourceforge.net/projects/softgl/">SoftGL</a>
<a href="https://sourceforge.net/projects/softgl/">SoftGL</a>
is an OpenGL subset for mobile devices.
</p>
@@ -213,7 +213,7 @@ If you don't already have GLUT installed, you should grab
<h2>2.4 Where is the GLw library?</h2>
<p>
GLw (OpenGL widget library) is now available from a separate <a href="http://cgit.freedesktop.org/mesa/glw/">git repository</a>. Unless you're using very old Xt/Motif applications with OpenGL, you shouldn't need it.
GLw (OpenGL widget library) is now available from a separate <a href="https://cgit.freedesktop.org/mesa/glw/">git repository</a>. Unless you're using very old Xt/Motif applications with OpenGL, you shouldn't need it.
</p>
@@ -276,7 +276,7 @@ If you're using a hardware accelerated driver you want <code>direct rendering: Y
</p>
<p>
If your DRI-based driver isn't working, go to the
<a href="http://dri.freedesktop.org/">DRI website</a> for trouble-shooting information.
<a href="https://dri.freedesktop.org/">DRI website</a> for trouble-shooting information.
</p>
@@ -284,7 +284,7 @@ If your DRI-based driver isn't working, go to the
<p>
Make sure the ratio of the far to near clipping planes isn't too great.
Look
<a href="http://www.opengl.org/resources/faq/technical/depthbuffer.htm#0040">here</a>
<a href="https://www.opengl.org/resources/faq/technical/depthbuffer.htm#0040">here</a>
for details.
</p>
<p>
@@ -339,7 +339,7 @@ First, join the <a href="lists.html">mesa-dev mailing list</a>.
That's where Mesa development is discussed.
</p>
<p>
The <a href="http://www.opengl.org/documentation">
The <a href="https://www.opengl.org/documentation">
OpenGL Specification</a> is the bible for OpenGL implementation work.
You should read it.
</p>
@@ -383,7 +383,7 @@ implement the extension (specifically the compression/decompression
algorithms).
</p>
<p>
In the mean time, a 3rd party <a href="http://dri.freedesktop.org/wiki/S3TC">
In the mean time, a 3rd party <a href="https://dri.freedesktop.org/wiki/S3TC">
plug-in library</a> is available.
</p>

@@ -78,18 +78,18 @@ GL 3.1, GLSL 1.40 --- all DONE: freedreno, i965, nv50, nvc0, r600, radeonsi, llv
GL_EXT_texture_snorm (Signed normalized textures) DONE ()
GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe
GL 3.2, GLSL 1.50 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, softpipe, swr
Core/compatibility profiles DONE
Geometry shaders DONE ()
GL_ARB_vertex_array_bgra (BGRA vertex order) DONE (freedreno, swr)
GL_ARB_draw_elements_base_vertex (Base vertex offset) DONE (freedreno, swr)
GL_ARB_fragment_coord_conventions (Frag shader coord) DONE (freedreno, swr)
GL_ARB_provoking_vertex (Provoking vertex) DONE (freedreno, swr)
GL_ARB_seamless_cube_map (Seamless cubemaps) DONE (freedreno, swr)
GL_ARB_texture_multisample (Multisample textures) DONE (swr)
GL_ARB_depth_clamp (Frag depth clamp) DONE (freedreno, swr)
GL_ARB_sync (Fence objects) DONE (freedreno, swr)
GL_ARB_vertex_array_bgra (BGRA vertex order) DONE (freedreno)
GL_ARB_draw_elements_base_vertex (Base vertex offset) DONE (freedreno)
GL_ARB_fragment_coord_conventions (Frag shader coord) DONE (freedreno)
GL_ARB_provoking_vertex (Provoking vertex) DONE (freedreno)
GL_ARB_seamless_cube_map (Seamless cubemaps) DONE (freedreno)
GL_ARB_texture_multisample (Multisample textures) DONE ()
GL_ARB_depth_clamp (Frag depth clamp) DONE (freedreno)
GL_ARB_sync (Fence objects) DONE (freedreno)
GLX_ARB_create_context_profile DONE
@@ -107,7 +107,7 @@ GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft
GL_ARB_vertex_type_2_10_10_10_rev DONE (freedreno, swr)
GL 4.0, GLSL 4.00 --- all DONE: i965/hsw+, nvc0, r600, radeonsi
GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
GL_ARB_draw_buffers_blend DONE (freedreno, i965/gen6+, nv50, llvmpipe, softpipe, swr)
GL_ARB_draw_indirect DONE (i965/gen7+, llvmpipe, softpipe, swr)
@@ -124,29 +124,29 @@ GL 4.0, GLSL 4.00 --- all DONE: i965/hsw+, nvc0, r600, radeonsi
- Enhanced per-sample shading DONE ()
- Interpolation functions DONE ()
- New overload resolution rules DONE
GL_ARB_gpu_shader_fp64 DONE (i965/hsw+, llvmpipe, softpipe)
GL_ARB_gpu_shader_fp64 DONE (i965/gen7+, llvmpipe, softpipe)
GL_ARB_sample_shading DONE (i965/gen6+, nv50)
GL_ARB_shader_subroutine DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
GL_ARB_tessellation_shader DONE (i965/gen7+)
GL_ARB_texture_buffer_object_rgb32 DONE (i965/gen6+, llvmpipe, softpipe, swr)
GL_ARB_texture_cube_map_array DONE (i965/gen6+, nv50, llvmpipe, softpipe)
GL_ARB_texture_gather DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
GL_ARB_texture_query_lod DONE (i965, nv50, softpipe)
GL_ARB_transform_feedback2 DONE (i965/gen7+, nv50, llvmpipe, softpipe, swr)
GL_ARB_texture_query_lod DONE (i965, nv50, llvmpipe, softpipe)
GL_ARB_transform_feedback2 DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr)
GL_ARB_transform_feedback3 DONE (i965/gen7+, llvmpipe, softpipe, swr)
GL 4.1, GLSL 4.10 --- all DONE: i965/hsw+, nvc0, r600, radeonsi
GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
GL_ARB_ES2_compatibility DONE (i965, nv50, llvmpipe, softpipe, swr)
GL_ARB_get_program_binary DONE (0 binary formats)
GL_ARB_separate_shader_objects DONE (all drivers)
GL_ARB_shader_precision DONE (i965/hsw+, all drivers that support GLSL 4.10)
GL_ARB_vertex_attrib_64bit DONE (i965/hsw+, llvmpipe, softpipe)
GL_ARB_shader_precision DONE (i965/gen7+, all drivers that support GLSL 4.10)
GL_ARB_vertex_attrib_64bit DONE (i965/gen7+, llvmpipe, softpipe)
GL_ARB_viewport_array DONE (i965, nv50, llvmpipe, softpipe)
GL 4.2, GLSL 4.20 -- all DONE: i965/hsw+, nvc0, radeonsi
GL 4.2, GLSL 4.20 -- all DONE: i965/gen7+, nvc0, radeonsi
GL_ARB_texture_compression_bptc DONE (i965, r600)
GL_ARB_compressed_texture_pixel_storage DONE (all drivers)
@@ -191,8 +191,8 @@ GL 4.3, GLSL 4.30 -- all DONE: i965/gen8+, nvc0, radeonsi
GL 4.4, GLSL 4.40 -- all DONE: i965/gen8+, nvc0, radeonsi
GL_MAX_VERTEX_ATTRIB_STRIDE DONE (all drivers)
GL_ARB_buffer_storage DONE (i965, nv50, r600)
GL_ARB_clear_texture DONE (i965, nv50, r600)
GL_ARB_buffer_storage DONE (i965, nv50, r600, llvmpipe, swr)
GL_ARB_clear_texture DONE (i965, nv50, r600, llvmpipe, softpipe, swr)
GL_ARB_enhanced_layouts DONE (i965, nv50, llvmpipe, softpipe)
- compile-time constant expressions DONE
- explicit byte offsets for blocks DONE
@@ -221,6 +221,22 @@ GL 4.5, GLSL 4.50 -- all DONE: nvc0, radeonsi
GL_KHR_robustness DONE (i965)
GL_EXT_shader_integer_mix DONE (all drivers that support GLSL)
GL 4.6, GLSL 4.60
GL_ARB_gl_spirv in progress (Nicolai Hähnle, Ian Romanick)
GL_ARB_indirect_parameters DONE (i965/gen7+, nvc0, radeonsi)
GL_ARB_pipeline_statistics_query DONE (i965, nvc0, radeonsi, llvmpipe, softpipe, swr)
GL_ARB_polygon_offset_clamp DONE (i965, nv50, nvc0, r600, radeonsi, llvmpipe, swr)
GL_ARB_shader_atomic_counter_ops DONE (i965/gen7+, nvc0, radeonsi, softpipe)
GL_ARB_shader_draw_parameters DONE (i965, nvc0, radeonsi)
GL_ARB_shader_group_vote DONE (i965, nvc0, radeonsi)
GL_ARB_spirv_extensions in progress (Nicolai Hähnle, Ian Romanick)
GL_ARB_texture_filter_anisotropic DONE (i965, nv50, nvc0, r600, radeonsi, softpipe (*), llvmpipe (*))
GL_ARB_transform_feedback_overflow_query DONE (i965/gen6+, radeonsi, llvmpipe, softpipe)
GL_KHR_no_error started (Timothy Arceri)
(*) softpipe and llvmpipe advertise 16x anisotropy but simply ignore the setting
These are the extensions cherry-picked to make GLES 3.1
GLES3.1, GLSL ES 3.1 -- all DONE: i965/hsw+, nvc0, radeonsi
@@ -277,41 +293,39 @@ GLES3.2, GLSL ES 3.2 -- all DONE: i965/gen9+
Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES version:
GL_ARB_bindless_texture started (airlied)
GL_ARB_bindless_texture DONE (radeonsi)
GL_ARB_cl_event not started
GL_ARB_compute_variable_group_size DONE (nvc0, radeonsi)
GL_ARB_ES3_2_compatibility DONE (i965/gen8+)
GL_ARB_fragment_shader_interlock not started
GL_ARB_gl_spirv not started
GL_ARB_gpu_shader_int64 started (airlied for core and Gallium, idr for i965)
GL_ARB_indirect_parameters DONE (nvc0, radeonsi)
GL_ARB_gpu_shader_int64 DONE (i965/gen8+, nvc0, radeonsi, softpipe, llvmpipe)
GL_ARB_parallel_shader_compile not started, but Chia-I Wu did some related work in 2014
GL_ARB_pipeline_statistics_query DONE (i965, nvc0, radeonsi, softpipe, swr)
GL_ARB_post_depth_coverage DONE (i965)
GL_ARB_robustness_isolation not started
GL_ARB_sample_locations not started
GL_ARB_seamless_cubemap_per_texture DONE (i965, nvc0, radeonsi, r600, softpipe, swr)
GL_ARB_shader_atomic_counter_ops DONE (nvc0, radeonsi, softpipe)
GL_ARB_shader_ballot not started
GL_ARB_shader_clock DONE (i965/gen7+)
GL_ARB_shader_draw_parameters DONE (i965, nvc0, radeonsi)
GL_ARB_shader_group_vote DONE (nvc0)
GL_ARB_shader_ballot DONE (i965/gen8+, nvc0, radeonsi)
GL_ARB_shader_clock DONE (i965/gen7+, nv50, nvc0, radeonsi)
GL_ARB_shader_stencil_export DONE (i965/gen9+, radeonsi, softpipe, llvmpipe, swr)
GL_ARB_shader_viewport_layer_array DONE (i965/gen6+)
GL_ARB_sparse_buffer not started
GL_ARB_shader_viewport_layer_array DONE (i965/gen6+, nvc0, radeonsi)
GL_ARB_sparse_buffer DONE (radeonsi/CIK+)
GL_ARB_sparse_texture not started
GL_ARB_sparse_texture2 not started
GL_ARB_sparse_texture_clamp not started
GL_ARB_texture_filter_minmax not started
GL_ARB_transform_feedback_overflow_query not started
GL_EXT_memory_object DONE (radeonsi)
GL_EXT_memory_object_fd DONE (radeonsi)
GL_EXT_memory_object_win32 not started
GL_EXT_semaphore not started
GL_EXT_semaphore_fd not started
GL_EXT_semaphore_win32 not started
GL_KHR_blend_equation_advanced_coherent DONE (i965/gen9+)
GL_KHR_no_error not started
GL_KHR_texture_compression_astc_hdr DONE (core only)
GL_KHR_texture_compression_astc_sliced_3d not started
GL_KHR_texture_compression_astc_hdr DONE (i965/bxt)
GL_KHR_texture_compression_astc_sliced_3d DONE (i965/gen9+)
GL_OES_depth_texture_cube_map DONE (all drivers that support GLSL 1.30+)
GL_OES_EGL_image DONE (all drivers)
GL_OES_EGL_image_external_essl3 not started
GL_OES_required_internalformat not started - GLES2 extension based on OpenGL ES 3.0 feature
GL_OES_required_internalformat DONE (all drivers)
GL_OES_surfaceless_context DONE (all drivers)
GL_OES_texture_compression_astc DONE (core only)
GL_OES_texture_float DONE (i965, r300, r600, radeonsi, nv30, nv50, nvc0, softpipe, llvmpipe)
@@ -333,5 +347,47 @@ we DO NOT WANT implementations of these extensions for Mesa.
GL_ARB_shadow_ambient Superseded by GL_ARB_fragment_program
GL_ARB_vertex_blend Superseded by GL_ARB_vertex_program
More info about these features and the work involved can be found at
http://dri.freedesktop.org/wiki/MissingFunctionality
Vulkan 1.0 -- all DONE: anv, radv
Khronos extensions that are not part of any Vulkan version:
VK_KHR_16bit_storage in progress (Alejandro)
VK_KHR_android_surface not started
VK_KHR_dedicated_allocation DONE (anv, radv)
VK_KHR_descriptor_update_template DONE (anv, radv)
VK_KHR_display not started
VK_KHR_display_swapchain not started
VK_KHR_external_fence not started
VK_KHR_external_fence_capabilities not started
VK_KHR_external_fence_fd not started
VK_KHR_external_fence_win32 not started
VK_KHR_external_memory DONE (anv, radv)
VK_KHR_external_memory_capabilities DONE (anv, radv)
VK_KHR_external_memory_fd DONE (anv, radv)
VK_KHR_external_memory_win32 not started
VK_KHR_external_semaphore DONE (radv)
VK_KHR_external_semaphore_capabilities DONE (radv)
VK_KHR_external_semaphore_fd DONE (radv)
VK_KHR_external_semaphore_win32 not started
VK_KHR_get_memory_requirements2 DONE (anv, radv)
VK_KHR_get_physical_device_properties2 DONE (anv, radv)
VK_KHR_get_surface_capabilities2 DONE (anv)
VK_KHR_incremental_present DONE (anv, radv)
VK_KHR_maintenance1 DONE (anv, radv)
VK_KHR_mir_surface not started
VK_KHR_push_descriptor DONE (anv, radv)
VK_KHR_sampler_mirror_clamp_to_edge DONE (anv, radv)
VK_KHR_shader_draw_parameters DONE (anv, radv)
VK_KHR_shared_presentable_image not started
VK_KHR_storage_buffer_storage_class DONE (anv, radv)
VK_KHR_surface DONE (anv, radv)
VK_KHR_swapchain DONE (anv, radv)
VK_KHR_variable_pointers DONE (anv, radv)
VK_KHR_wayland_surface DONE (anv, radv)
VK_KHR_win32_keyed_mutex not started
VK_KHR_win32_surface not started
VK_KHR_xcb_surface DONE (anv, radv)
VK_KHR_xlib_surface DONE (anv, radv)
A graphical representation of this information can be found at
https://mesamatrix.net/

@@ -24,7 +24,7 @@ Here are some specific ideas and areas where help would be appreciated:
<ol>
<li>
<b>Driver patching and testing.</b>
Patches are often posted to the <a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev">mesa-dev mailing list</a>, but aren't
Patches are often posted to the <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev">mesa-dev mailing list</a>, but aren't
immediately checked into git because not enough people are testing them.
Just applying patches, testing and reporting back is helpful.
<li>
@@ -35,17 +35,8 @@ There are plenty of open bugs in the <a href="https://bugs.freedesktop.org/descr
Enable gcc -Wstrict-aliasing=2 -fstrict-aliasing and track down aliasing
issues in the code.
<li>
<b>Windows driver building, testing and maintenance.</b>
Fixing MSVC builds.
<li>
<b>Contribute more tests to
<a href="http://piglit.freedesktop.org/">Piglit</a>.</b>
<li>
<b>Automatic testing.
</b>
It would be great if someone would set up an automated system for grabbing
the latest Mesa code and run tests (such as piglit) then report issues to
the mailing list.
<a href="https://piglit.freedesktop.org/">Piglit</a>.</b>
</ol>
<p>
@@ -56,26 +47,18 @@ You can find some further To-do lists here:
<b>Common To-Do lists:</b>
</p>
<ul>
<li><a href="http://cgit.freedesktop.org/mesa/mesa/tree/docs/features.txt">
<li><a href="https://cgit.freedesktop.org/mesa/mesa/tree/docs/features.txt">
<b>features.txt</b></a> - Status of OpenGL 3.x / 4.x features in Mesa.</li>
<li><a href="http://dri.freedesktop.org/wiki/MissingFunctionality">
<b>MissingFunctionality</b></a> - Detailed information about missing OpenGL features.</li>
</ul>
<p>
<b>Driver specific To-Do lists:</b>
<b>Legacy Driver specific To-Do lists:</b>
</p>
<ul>
<li><a href="http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/docs/llvm-todo.txt">
<b>LLVMpipe</b></a> - Software driver using LLVM for runtime code generation.</li>
<li><a href="http://dri.freedesktop.org/wiki/RadeonsiToDo">
<b>radeonsi</b></a> - Driver for AMD Southern Island.</li>
<li><a href="http://dri.freedesktop.org/wiki/R600ToDo">
<li><a href="https://dri.freedesktop.org/wiki/R600ToDo">
<b>r600g</b></a> - Driver for ATI/AMD R600 - Northern Island.</li>
<li><a href="http://dri.freedesktop.org/wiki/R300ToDo">
<li><a href="https://dri.freedesktop.org/wiki/R300ToDo">
<b>r300g</b></a> - Driver for ATI R300 - R500.</li>
<li><a href="http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/i915/TODO">
<b>i915g</b></a> - Driver for Intel i915/i945.</li>
</ul>
<p>


@@ -16,6 +16,180 @@
<h1>News</h1>
<h2>October 19, 2017</h2>
<p>
<a href="relnotes/17.2.3.html">Mesa 17.2.3</a> is released.
This is a bug-fix release.
</p>
<h2>October 2, 2017</h2>
<p>
<a href="relnotes/17.2.2.html">Mesa 17.2.2</a> is released.
This is a bug-fix release.
</p>
<h2>September 25, 2017</h2>
<p>
<a href="relnotes/17.1.10.html">Mesa 17.1.10</a> is released.
This is a bug-fix release.
</p>
<h2>September 17, 2017</h2>
<p>
<a href="relnotes/17.2.1.html">Mesa 17.2.1</a> is released.
This is a bug-fix release.
</p>
<h2>September 8, 2017</h2>
<p>
<a href="relnotes/17.1.9.html">Mesa 17.1.9</a> is released.
This is a bug-fix release.
</p>
<h2>September 4, 2017</h2>
<p>
<a href="relnotes/17.2.0.html">Mesa 17.2.0</a> is released. This is a
new development release. See the release notes for more information
about the release.
</p>
<h2>August 28, 2017</h2>
<p>
<a href="relnotes/17.1.8.html">Mesa 17.1.8</a> is released.
This is a bug-fix release.
</p>
<h2>August 21, 2017</h2>
<p>
<a href="relnotes/17.1.7.html">Mesa 17.1.7</a> is released.
This is a bug-fix release.
</p>
<h2>August 7, 2017</h2>
<p>
<a href="relnotes/17.1.6.html">Mesa 17.1.6</a> is released.
This is a bug-fix release.
</p>
<h2>July 14, 2017</h2>
<p>
<a href="relnotes/17.1.5.html">Mesa 17.1.5</a> is released.
This is a bug-fix release.
</p>
<h2>June 30, 2017</h2>
<p>
<a href="relnotes/17.1.4.html">Mesa 17.1.4</a> is released.
This is a bug-fix release.
</p>
<h2>June 19, 2017</h2>
<p>
<a href="relnotes/17.1.3.html">Mesa 17.1.3</a> is released.
This is a bug-fix release.
</p>
<h2>June 5, 2017</h2>
<p>
<a href="relnotes/17.1.2.html">Mesa 17.1.2</a> is released.
This is a bug-fix release.
</p>
<h2>June 1, 2017</h2>
<p>
<a href="relnotes/17.0.7.html">Mesa 17.0.7</a> is released.
This is a bug-fix release.
<br>
NOTE: It is anticipated that 17.0.7 will be the final release in the 17.0
series. Users of 17.0 are encouraged to migrate to the 17.1 series in order
to obtain future fixes.
</p>
<h2>May 25, 2017</h2>
<p>
<a href="relnotes/17.1.1.html">Mesa 17.1.1</a> is released.
This is a bug-fix release.
</p>
<h2>May 12, 2017</h2>
<p>
<a href="relnotes/17.0.6.html">Mesa 17.0.6</a> is released.
This is a bug-fix release.
</p>
<h2>May 10, 2017</h2>
<p>
<a href="relnotes/17.1.0.html">Mesa 17.1.0</a> is released. This is a
new development release. See the release notes for more information
about the release.
</p>
<h2>April 28, 2017</h2>
<p>
<a href="relnotes/17.0.5.html">Mesa 17.0.5</a> is released.
This is a bug-fix release.
</p>
<h2>April 17, 2017</h2>
<p>
<a href="relnotes/17.0.4.html">Mesa 17.0.4</a> is released.
This is a bug-fix release.
</p>
<h2>April 1, 2017</h2>
<p>
<a href="relnotes/17.0.3.html">Mesa 17.0.3</a> is released.
This is a bug-fix release.
</p>
<h2>March 20, 2017</h2>
<p>
<a href="relnotes/13.0.6.html">Mesa 13.0.6</a> and
<a href="relnotes/17.0.2.html">Mesa 17.0.2</a> are released.
These are bug-fix releases from the 13.0 and 17.0 branches, respectively.
<br>
NOTE: It is anticipated that 13.0.6 will be the final release in the 13.0
series. Users of 13.0 are encouraged to migrate to the 17.0 series in order
to obtain future fixes.
</p>
<h2>March 4, 2017</h2>
<p>
<a href="relnotes/17.0.1.html">Mesa 17.0.1</a> is released.
This is a bug-fix release.
</p>
<h2>February 20, 2017</h2>
<p>
<a href="relnotes/13.0.5.html">Mesa 13.0.5</a> is released.
This is a bug-fix release.
</p>
<h2>February 13, 2017</h2>
<p>
<a href="relnotes/17.0.0.html">Mesa 17.0.0</a> is released. This is a
new development release. See the release notes for more information
about the release.
</p>
<h2>February 1, 2017</h2>
<p>
<a href="relnotes/13.0.4.html">Mesa 13.0.4</a> is released.
This is a bug-fix release.
</p>
<h2>January 23, 2017</h2>
<p>
<a href="relnotes/12.0.6.html">Mesa 12.0.6</a> is released.
This is a bug-fix release.
<br>
NOTE: This is an extra release for the 12.0 stable branch, as per developers'
feedback. It is anticipated that 12.0.6 will be the final release in the 12.0
series. Users of 12.0 are encouraged to migrate to the 13.0 series in order
to obtain future fixes.
</p>
<h2>January 5, 2017</h2>
<p>
<a href="relnotes/13.0.3.html">Mesa 13.0.3</a> is released.
@@ -151,7 +325,7 @@ This is a bug-fix release.
</p>
<p>
Mesa demos 8.3.0 is also released.
See the <a href="http://lists.freedesktop.org/archives/mesa-announce/2015-December/000191.html">announcement</a> for more information about the release.
See the <a href="https://lists.freedesktop.org/archives/mesa-announce/2015-December/000191.html">announcement</a> for more information about the release.
You can download it from <a href="ftp://ftp.freedesktop.org/pub/mesa/demos/8.3.0/">ftp.freedesktop.org/pub/mesa/demos/8.3.0/</a>.
</p>
@@ -466,7 +640,7 @@ This is a bug-fix release.
<p>
Mesa demos 8.2.0 is released.
See the <a href="http://lists.freedesktop.org/archives/mesa-announce/2014-July/000100.html">announcement</a> for more information about the release.
See the <a href="https://lists.freedesktop.org/archives/mesa-announce/2014-July/000100.html">announcement</a> for more information about the release.
You can download it from <a href="ftp://ftp.freedesktop.org/pub/mesa/demos/8.2.0/">ftp.freedesktop.org/pub/mesa/demos/8.2.0/</a>.
</p>
@@ -645,7 +819,7 @@ This is a bug fix release.
<p>
Mesa demos 8.1.0 is released.
See the <a href="http://lists.freedesktop.org/archives/mesa-dev/2013-February/035180.html">announcement</a> for more information about the release.
See the <a href="https://lists.freedesktop.org/archives/mesa-dev/2013-February/035180.html">announcement</a> for more information about the release.
You can download it from <a href="ftp://ftp.freedesktop.org/pub/mesa/demos/8.1.0/">ftp.freedesktop.org/pub/mesa/demos/8.1.0/</a>.
</p>
@@ -1341,7 +1515,7 @@ and primarily just incorporates bug fixes.
<h2>December 28, 2003</h2>
<p>
The Mesa CVS server has been moved to <a href="http://www.freedesktop.org">
The Mesa CVS server has been moved to <a href="https://www.freedesktop.org">
freedesktop.org</a> because of problems with SourceForge's anonymous
CVS service.
</p>
@@ -1913,7 +2087,7 @@ Here's what's new:</p>
</pre>
<h2>March 23, 2000</h2>
<p>I've just uploaded the Mesa 3.2 beta 1 files to SourceForge at <a href="http://sourceforge.net/project/showfiles.php?group_id=3">http://sourceforge.net/project/filelist.php?group_id=3</a></p>
<p>I've just uploaded the Mesa 3.2 beta 1 files to SourceForge at <a href="https://sourceforge.net/project/showfiles.php?group_id=3">https://sourceforge.net/project/filelist.php?group_id=3</a></p>
<p>3.2 (note even number) is a stabilization release of Mesa 3.1 meaning it's mainly
just bug fixes.</p>
<p>Here's what's changed:</p>
@@ -1961,7 +2135,7 @@ After 3.2 is wrapped up I hope to release 3.3 beta 1 soon afterward.</p>
<h2>December 17, 1999</h2>
<p>A Slashdot interview with Brian about Mesa (questions submitted by Slashdot readers)
can be found at <a href="http://slashdot.org/interviews/99/12/17/0927212.shtml">http://slashdot.org/interviews/99/12/17/0927212.shtml</a>.</p>
can be found at <a href="https://slashdot.org/interviews/99/12/17/0927212.shtml">https://slashdot.org/interviews/99/12/17/0927212.shtml</a>.</p>
<h2>December 14, 1999</h2>
<p>Mesa 3.1 is released!</p>
@@ -1995,7 +2169,7 @@ BOF meeting is now available.</p>
<p>-Brian</p>
<h2>August 14, 1999</h2>
<p><a href="http://www.mesa3d.org">www.mesa3d.org</a> is having
<p><a href="https://www.mesa3d.org">www.mesa3d.org</a> is having
technical problems due to hardware failures at VA Linux systems. The Mac pages,
ftp, and CVS services aren't fully restored yet. Please be patient.</p>
<p>-Brian</p>
@@ -2004,9 +2178,9 @@ ftp, and CVS services aren't fully restored yet. Please be patient.</p>
<p>RPMS of the nVidia RIVA server can be found at <code>ftp://ftp.mesa3d.org/mesa/misc/nVidia/</code>.</p>
<h2>June 2, 1999</h2>
<p><a href="http://www.nvidia.com/">nVidia</a> has released some Linux binaries for
<p><a href="https://www.nvidia.com/">nVidia</a> has released some Linux binaries for
xfree86 3.3.3.1, along with the <b>full source</b>, which includes GLX acceleration
based on Mesa 3.0. They can be downloaded from <code>http://www.nvidia.com/Products.nsf/htmlmedia/software_drivers.html</code>.</p>
based on Mesa 3.0. They can be downloaded from <code>https://www.nvidia.com/Products.nsf/htmlmedia/software_drivers.html</code>.</p>
<h2>May 24, 1999</h2>
<p>Beta 2 of Mesa 3.1 has been made available at <code>ftp://ftp.mesa3d.org/mesa/beta/</code>.
@@ -2054,11 +2228,11 @@ grateful.
<p>The new webpages are now online. Enjoy, and let me know if you find any errors.
<h2>February 16, 1999</h2>
<p><a href="http://www.sgi.com/">SGI</a> releases its
<a href="http://www.sgi.com/software/opensource/glx/">GLX source code</a>.</p>
<p><a href="https://www.sgi.com/">SGI</a> releases its
<a href="https://www.sgi.com/software/opensource/glx/">GLX source code</a>.</p>
<h2>January 22, 1999</h2>
<p><a href="http://www.mesa3d.org">www.mesa3d.org</a> established</p>
<p><a href="https://www.mesa3d.org">www.mesa3d.org</a> established</p>
</div>
</body>


@@ -71,7 +71,7 @@ you think you've spotted a bug let developers know by filing a
<ul>
<li><a href="http://www.python.org/">Python</a> - Python is required.
<li><a href="https://www.python.org/">Python</a> - Python is required.
Version 2.6.4 or later should work.
</li>
<li><a href="http://www.makotemplates.org/">Python Mako module</a> -
@@ -178,7 +178,7 @@ your experience might vary.
<p>
In order to achieve that one should update their local manifest to point to the
upstream repo, set the approapriate BOARD_GPU_DRIVERS and build the
upstream repo, set the appropriate BOARD_GPU_DRIVERS and build the
libGLES_mesa library.
</p>


@@ -17,22 +17,34 @@
<h1>Introduction</h1>
<p>
Mesa is an open-source implementation of the
<a href="http://www.opengl.org/">OpenGL</a> specification -
The Mesa project began as an open-source implementation of the
<a href="https://www.opengl.org/">OpenGL</a> specification -
a system for rendering interactive 3D graphics.
</p>
<p>
A variety of device drivers allows Mesa to be used in many different
environments ranging from software emulation to complete hardware acceleration
for modern GPUs.
Over the years the project has grown to implement more graphics APIs,
including
<a href="https://www.khronos.org/opengles/">OpenGL ES</a> (versions 1, 2, 3),
<a href="https://www.khronos.org/opencl/">OpenCL</a>,
<a href="https://www.khronos.org/openmax/">OpenMAX</a>,
<a href="https://en.wikipedia.org/wiki/VDPAU">VDPAU</a>,
<a href="https://en.wikipedia.org/wiki/Video_Acceleration_API">VA API</a>,
<a href="https://en.wikipedia.org/wiki/X-Video_Motion_Compensation">XvMC</a> and
<a href="https://www.khronos.org/vulkan/">Vulkan</a>.
</p>
<p>
Mesa ties into several other open-source projects: the
<a href="http://dri.freedesktop.org/">Direct Rendering
Infrastructure</a> and <a href="http://x.org">X.org</a> to
provide OpenGL support to users of X on Linux, FreeBSD and other operating
A variety of device drivers allows the Mesa libraries to be used in many
different environments ranging from software emulation to complete hardware
acceleration for modern GPUs.
</p>
<p>
Mesa ties into several other open-source projects: the
<a href="https://dri.freedesktop.org/">Direct Rendering
Infrastructure</a> and <a href="https://x.org">X.org</a> to
provide OpenGL support on Linux, FreeBSD and other operating
systems.
</p>
@@ -85,7 +97,7 @@ the OpenGL API, so they didn't feel threatened by the project.
1995-1996: I continue working on Mesa both during my spare time and during
my work hours at the Space Science and Engineering Center at the University
of Wisconsin in Madison. My supervisor, Bill Hibbard, lets me do this because
Mesa is now being used for the <a href="http://www.ssec.wisc.edu/%7Ebillh/vis.html">Vis5D</a> project.
Mesa is now being used for the <a href="https://www.ssec.wisc.edu/%7Ebillh/vis.html">Vis5D</a> project.
</p><p>
October 1996: Mesa 2.0 is released. It implements the OpenGL 1.1 specification.
</p>
@@ -142,7 +154,7 @@ and OpenGL Shading Language.
<p>
2008: Keith Whitwell and other Tungsten Graphics employees develop
<a href="http://en.wikipedia.org/wiki/Gallium3D">Gallium</a>
<a href="https://en.wikipedia.org/wiki/Gallium3D">Gallium</a>
- a new GPU abstraction layer. The latest Mesa drivers are based on
Gallium and other APIs such as OpenVG are implemented on top of Gallium.
</p>
@@ -153,13 +165,22 @@ and version 1.30 of the OpenGL Shading Language.
</p>
<p>
Ongoing: Mesa is the OpenGL implementation for several types of hardware
made by Intel, AMD and NVIDIA, plus the VMware virtual GPU.
July 2016: Mesa 12.0 is released, including OpenGL 4.3 support and initial
support for Vulkan for Intel GPUs. Plus, there's another gallium software
driver ("swr") based on LLVM and developed by Intel.
</p>
<p>
Ongoing: Mesa is the OpenGL implementation for devices designed by
Intel, AMD, NVIDIA, Qualcomm, Broadcom, Vivante, plus the VMware and
VirGL virtual GPUs.
There are also several software-based renderers: swrast (the legacy
Mesa rasterizer), softpipe (a gallium reference driver) and llvmpipe
(LLVM/JIT-based high-speed rasterizer).
Mesa rasterizer), softpipe (a gallium reference driver), llvmpipe
(LLVM/JIT-based high-speed rasterizer) and swr (another LLVM-based driver).
</p>
<p>
Work continues on the drivers and core Mesa to implement newer versions
of the OpenGL specification.
of the OpenGL, OpenGL ES and Vulkan specifications.
</p>
@@ -178,6 +199,9 @@ of the OpenGL specification is implemented.
Version 12.x of Mesa implements the OpenGL 4.3 API, but not all drivers
support OpenGL 4.3.
</p>
<p>
Initial support for Vulkan is also included.
</p>
<h2>Version 11.x features</h2>
@@ -259,7 +283,7 @@ GL_SRC2_ALPHA GL_SOURCE2_ALPHA
</pre>
<p>
See the
<a href="http://www.opengl.org/documentation/spec.html">
<a href="https://www.opengl.org/documentation/spec.html">
OpenGL specification</a> for more details.
</p>


@@ -59,7 +59,7 @@ to learn if it is thread safe.
Indirect Rendering
You can force indirect rendering mode by setting the LIBGL_ALWAYS_INDIRECT
environment variable. Hardware acceleration will not be used.
environment variable to `true`. Hardware acceleration will not be used.
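
A minimal sketch of how the variable might be set for a single shell session. It assumes the `glxinfo` utility is installed; that tool is not mentioned above and is used here only to inspect the result. If it is missing, the snippet only reports the value that was set.

```shell
# Force indirect rendering for programs started from this shell.
export LIBGL_ALWAYS_INDIRECT=true

# Assumption: glxinfo is available. It is used only to show whether the
# resulting GLX context is direct or indirect.
if command -v glxinfo >/dev/null 2>&1; then
    # "|| true" keeps the snippet from failing when no X display is reachable.
    glxinfo | grep "direct rendering" || true
else
    echo "glxinfo not found; LIBGL_ALWAYS_INDIRECT=$LIBGL_ALWAYS_INDIRECT"
fi
```

Exporting the variable affects every program launched afterwards from the same shell. To limit it to one program, prefix that command with `LIBGL_ALWAYS_INDIRECT=true` instead.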


@@ -18,10 +18,10 @@
<p>
Mesa is a 3-D graphics library with an API which is very similar to
that of <a href="http://www.opengl.org/">OpenGL</a>.*
that of <a href="https://www.opengl.org/">OpenGL</a>.*
To the extent that Mesa utilizes the OpenGL command syntax or state
machine, it is being used with authorization from <a
href="http://www.sgi.com/">Silicon Graphics,
href="https://www.sgi.com/">Silicon Graphics,
Inc.</a>(SGI). However, the author does not possess an OpenGL license
from SGI, and makes no claim that Mesa is in any way a compatible
replacement for OpenGL or associated with SGI. Those who want a
@@ -36,7 +36,7 @@ library</em>. <br>
</p>
<p>
* OpenGL is a trademark of <a href="http://www.sgi.com/"
* OpenGL is a trademark of <a href="https://www.sgi.com/"
>Silicon Graphics Incorporated</a>.
</p>


@@ -21,23 +21,23 @@
</p>
<ul>
<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/mesa-users">mesa-users</a>
<li><p><a href="https://lists.freedesktop.org/mailman/listinfo/mesa-users">mesa-users</a>
- intended for end-users of Mesa and DRI drivers. Newbie questions are OK,
but please try the general OpenGL resources and Mesa/DRI documentation first.</p>
</li>
<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev">mesa-dev</a>
<li><p><a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev">mesa-dev</a>
- for Mesa, Gallium and DRI development
discussion. Not for beginners.</p>
</li>
<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/mesa-commit">mesa-commit</a>
<li><p><a href="https://lists.freedesktop.org/mailman/listinfo/mesa-commit">mesa-commit</a>
- relays git check-in messages (for developers).
In general, people should not post to this list.</p>
</li>
<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/mesa-announce">mesa-announce</a>
<li><p><a href="https://lists.freedesktop.org/mailman/listinfo/mesa-announce">mesa-announce</a>
- announcements of new Mesa
versions are sent to this list. Very low traffic.</p>
</li>
<li><p><a href="http://lists.freedesktop.org/mailman/listinfo/piglit">piglit</a>
<li><p><a href="https://lists.freedesktop.org/mailman/listinfo/piglit">piglit</a>
- for Piglit (OpenGL driver testing framework) discussion.</p>
</li>
</ul>
@@ -56,22 +56,22 @@ Follow the links above for list archives.
<p>
The old Mesa lists hosted at SourceForge are no longer in use.
The archives are still available, however:
<a href="http://sourceforge.net/mailarchive/forum.php?forum_name=mesa3d-announce">mesa3d-announce</a>,
<a href="http://sourceforge.net/mailarchive/forum.php?forum_name=mesa3d-users">mesa3d-users</a>,
<a href="http://sourceforge.net/mailarchive/forum.php?forum_name=mesa3d-dev">mesa3d-dev</a>.
<a href="https://sourceforge.net/mailarchive/forum.php?forum_name=mesa3d-announce">mesa3d-announce</a>,
<a href="https://sourceforge.net/mailarchive/forum.php?forum_name=mesa3d-users">mesa3d-users</a>,
<a href="https://sourceforge.net/mailarchive/forum.php?forum_name=mesa3d-dev">mesa3d-dev</a>.
</p>
<p>For mailing lists about Direct Rendering Modules (drm) in Linux/BSD
kernels, see the
<a href="http://dri.freedesktop.org/wiki/MailingLists">DRI wiki</a>.
<a href="https://dri.freedesktop.org/wiki/MailingLists">DRI wiki</a>.
</p>
<h1>IRC</h1>
<p>join <a href="irc://chat.freenode.net#dri-devel">#dri-devel channel</a>
on <a href="http://webchat.freenode.net/">irc.freenode.net</a>
on <a href="https://webchat.freenode.net/">irc.freenode.net</a>
</p>
@@ -82,7 +82,7 @@ Here are some other OpenGL-related forums you might find useful:
</p>
<ul>
<li><a href="http://www.opengl.org/cgi-bin/ubb/ultimatebb.cgi">OpenGL discussion forums</a>
<li><a href="https://www.opengl.org/discussion_boards/">OpenGL discussion forums</a>
at www.opengl.org</li>
<li>Usenet newsgroups:
<ul>


@@ -34,7 +34,7 @@ It's the fastest software rasterizer for Mesa.
<li>
<p>An x86 or amd64 processor; 64-bit mode recommended.</p>
<p>
Support for SSE2 is strongly encouraged. Support for SSSE3 and SSE4.1 will
Support for SSE2 is strongly encouraged. Support for SSE3 and SSE4.1 will
yield the most efficient code. The fewer features the CPU has the more
likely it is that you run into underperforming, buggy, or incomplete code.
</p>
@@ -165,8 +165,8 @@ any OpenGL drivers):
<li><p>load this registry settings:</p>
<pre>REGEDIT4
; http://technet.microsoft.com/en-us/library/cc749368.aspx
; http://www.msfn.org/board/topic/143241-portable-windows-7-build-from-winpe-30/page-5#entry942596
; https://technet.microsoft.com/en-us/library/cc749368.aspx
; https://www.msfn.org/board/topic/143241-portable-windows-7-build-from-winpe-30/page-5#entry942596
[HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows NT\CurrentVersion\OpenGLDrivers\MSOGL]
"DLL"="mesadrv.dll"
"DriverVersion"=dword:00000001
@@ -195,7 +195,7 @@ that no tail call optimizations are done by gcc.
<h2>Linux perf integration</h2>
<p>
On Linux, it is possible to have symbol resolution of JIT code with <a href="http://perf.wiki.kernel.org/">Linux perf</a>:
On Linux, it is possible to have symbol resolution of JIT code with <a href="https://perf.wiki.kernel.org/">Linux perf</a>:
</p>
<pre>
@@ -206,12 +206,12 @@ On Linux, it is possible to have symbol resolution of JIT code with <a href="htt
<p>
When run inside Linux perf, llvmpipe will create a /tmp/perf-XXXXX.map file with
symbol address table. It also dumps assembly code to /tmp/perf-XXXXX.map.asm,
which can be used by the bin/perf-annotate-jit script to produce disassembly of
which can be used by the bin/perf-annotate-jit.py script to produce disassembly of
the generated code annotated with the samples.
</p>
<p>You can obtain a call graph via
<a href="http://code.google.com/p/jrfonseca/wiki/Gprof2Dot#linux_perf">Gprof2Dot</a>.</p>
<a href="https://github.com/jrfonseca/gprof2dot#linux-perf">Gprof2Dot</a>.</p>
<h1>Unit testing</h1>
@@ -253,7 +253,7 @@ for posterior analysis, e.g.:
We use LLVM-C bindings for now. They are not documented, but follow the C++
interfaces very closely, and appear to be complete enough for code
generation. See
<a href="http://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html">
<a href="https://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html">
this stand-alone example</a>. See the llvm-c/Core.h file for reference.
</li>
</ul>
@@ -264,18 +264,18 @@ for posterior analysis, e.g.:
<li>
<p>Rasterization</p>
<ul>
<li><a href="http://www.cs.unc.edu/~olano/papers/2dh-tri/">Triangle Scan Conversion using 2D Homogeneous Coordinates</a></li>
<li><a href="https://www.cs.unc.edu/~olano/papers/2dh-tri/">Triangle Scan Conversion using 2D Homogeneous Coordinates</a></li>
<li><a href="http://www.drdobbs.com/parallel/rasterization-on-larrabee/217200602">Rasterization on Larrabee</a> (<a href="http://devmaster.net/posts/2887/rasterization-on-larrabee">DevMaster copy</a>)</li>
<li><a href="http://devmaster.net/posts/6133/rasterization-using-half-space-functions">Rasterization using half-space functions</a></li>
<li><a href="http://devmaster.net/posts/6145/advanced-rasterization">Advanced Rasterization</a></li>
<li><a href="http://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/">Optimizing Software Occlusion Culling</a></li>
<li><a href="https://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/">Optimizing Software Occlusion Culling</a></li>
</ul>
</li>
<li>
<p>Texture sampling</p>
<ul>
<li><a href="http://chrishecker.com/Miscellaneous_Technical_Articles#Perspective_Texture_Mapping">Perspective Texture Mapping</a></li>
<li><a href="http://www.flipcode.com/archives/Texturing_As_In_Unreal.shtml">Texturing As In Unreal</a></li>
<li><a href="https://www.flipcode.com/archives/Texturing_As_In_Unreal.shtml">Texturing As In Unreal</a></li>
<li><a href="http://www.gamasutra.com/view/feature/3301/runtime_mipmap_filtering.php">Run-Time MIP-Map Filtering</a></li>
<li><a href="http://alt.3dcenter.org/artikel/2003/10-26_a_english.php">Will "brilinear" filtering persist?</a></li>
<li><a href="http://ixbtlabs.com/articles2/gffx/nv40-rx800-3.html">Trilinear filtering</a></li>
@@ -294,21 +294,21 @@ for posterior analysis, e.g.:
<li><a href="http://www.drdobbs.com/optimizing-pixomatic-for-modern-x86-proc/184405807">Optimizing Pixomatic For Modern x86 Processors</a></li>
<li><a href="http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html">Intel 64 and IA-32 Architectures Optimization Reference Manual</a></li>
<li><a href="http://www.agner.org/optimize/">Software optimization resources</a></li>
<li><a href="http://software.intel.com/en-us/articles/intel-intrinsics-guide">Intel Intrinsics Guide</a></li>
<li><a href="https://software.intel.com/en-us/articles/intel-intrinsics-guide">Intel Intrinsics Guide</a></li>
</ul>
</li>
<li>
<p>LLVM</p>
<ul>
<li><a href="http://llvm.org/docs/LangRef.html">LLVM Language Reference Manual</a></li>
<li><a href="http://npcontemplation.blogspot.co.uk/2008/06/secret-of-llvm-c-bindings.html">The secret of LLVM C bindings</a></li>
<li><a href="https://npcontemplation.blogspot.co.uk/2008/06/secret-of-llvm-c-bindings.html">The secret of LLVM C bindings</a></li>
</ul>
</li>
<li>
<p>General</p>
<ul>
<li><a href="http://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/">A trip through the Graphics Pipeline</a></li>
<li><a href="http://msdn.microsoft.com/en-us/library/gg615082.aspx#architecture">WARP Architecture and Performance</a></li>
<li><a href="https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/">A trip through the Graphics Pipeline</a></li>
<li><a href="https://msdn.microsoft.com/en-us/library/gg615082.aspx#architecture">WARP Architecture and Performance</a></li>
</ul>
</li>
</ul>


@@ -17,8 +17,8 @@
<h1>OpenGL ES</h1>
<p>Mesa implements OpenGL ES 1.1 and OpenGL ES 2.0. More information about
OpenGL ES can be found at <a href="http://www.khronos.org/opengles/">
http://www.khronos.org/opengles/</a>.</p>
OpenGL ES can be found at <a href="https://www.khronos.org/opengles/">
https://www.khronos.org/opengles/</a>.</p>
<p>OpenGL ES depends on a working EGL implementation. Please refer to
<a href="egl.html">Mesa EGL</a> for more information about EGL.</p>

View File

@@ -27,5 +27,5 @@ ARB_texture_float:
enable this extension.
[1] http://www.google.com/patents/about?id=mIIOAAAAEBAJ&dq=6650327
[2] http://www.opengl.org/registry/specs/ARB/texture_float.txt
[1] https://www.google.com/patents/about?id=mIIOAAAAEBAJ&dq=6650327
[2] https://www.opengl.org/registry/specs/ARB/texture_float.txt

View File

@@ -45,7 +45,7 @@ Multiple filters can be used together.
<li>pp_nored, pp_nogreen, pp_noblue - set to 1 to remove the corresponding color channel.
These are basic filters for easy testing of the PP queue.
<li>pp_jimenezmlaa, pp_jimenezmlaa_color -
<a href="http://www.iryokufx.com/mlaa/" target=_blank>Jimenez's MLAA</a>
<a href="https://www.iryokufx.com/mlaa/" target=_blank>Jimenez's MLAA</a>
is a morphological antialiasing filter.
The two versions use depth and color data, respectively.
Which works better depends on the app - depth will not blur text, but it will


@@ -20,8 +20,14 @@
In general, precompiled Mesa libraries are not available.
</p>
<p>
However, some Linux distros (such as Ubuntu) seem to closely track
Mesa and often have the latest Mesa release available as an update.
Some Linux distributions closely follow the latest Mesa releases. On others one
has to use unofficial channels.
<br>
There are some general directions:
<li>Debian/Ubuntu based distros - PPA: xorg-edgers, oibaf and padoka</li>
<li>Fedora - Copr: erp and che</li>
<li>OpenSuse/SLES - OBS: X11:XOrg and pontostroy:X11</li>
<li>Gentoo/Archlinux - officially provided/supported</li>
</p>
</div>

docs/release-calendar.html Normal file

@@ -0,0 +1,113 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Release calendar</title>
<link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Overview</h1>
<p>
Mesa provides feature/development and stable releases.
</p>
<p>
The table below lists the expected date and the release manager for each
release.
<br>
Take a look <a href="submittingpatches.html#criteria" target="_parent">here</a>
if you'd like to nominate a patch for the next stable release.
</p>
<h1 id="calendar">Calendar</h1>
<table border="1">
<tr>
<th>Branch</th>
<th>Expected date</th>
<th>Release</th>
<th>Release manager</th>
<th>Notes</th>
</tr>
<tr>
<td rowspan="4">17.2</td>
<td>2017-10-27</td>
<td>17.2.4</td>
<td>Andres Gomez</td>
<td></td>
</tr>
<tr>
<td>2017-11-10</td>
<td>17.2.5</td>
<td>Andres Gomez</td>
<td></td>
</tr>
<tr>
<td>2017-11-24</td>
<td>17.2.6</td>
<td>Andres Gomez</td>
<td></td>
</tr>
<tr>
<td>2017-12-08</td>
<td>17.2.7</td>
<td>Emil Velikov</td>
<td>Final planned release for the 17.2 series</td>
</tr>
<tr>
<td rowspan="7">17.3</td>
<td>2017-10-20</td>
<td>17.3.0-rc1</td>
<td>Emil Velikov</td>
<td></td>
</tr>
<tr>
<td>2017-10-27</td>
<td>17.3.0-rc2</td>
<td>Emil Velikov</td>
<td></td>
</tr>
<tr>
<td>2017-11-03</td>
<td>17.3.0-rc3</td>
<td>Emil Velikov</td>
<td></td>
</tr>
<tr>
<td>2017-11-10</td>
<td>17.3.0-rc4</td>
<td>Emil Velikov</td>
<td>May be promoted to 17.3.0 final</td>
</tr>
<tr>
<td>2017-11-24</td>
<td>17.3.1</td>
<td>Andres Gomez</td>
<td></td>
</tr>
<tr>
<td>2017-12-08</td>
<td>17.3.2</td>
<td>Emil Velikov</td>
<td></td>
</tr>
<tr>
<td>2017-12-22</td>
<td>17.3.3</td>
<td>Emil Velikov</td>
<td></td>
</tr>
</table>
</div>
</body>
</html>


@@ -14,6 +14,7 @@
<iframe src="contents.html"></iframe>
<div class="content">
<h1>Releasing process</h1>
<ul>
@@ -28,6 +29,7 @@
<li><a href="#bugzilla">Update Bugzilla</a>
</ul>
<h1 id="overview">Overview</h1>
<p>
@@ -48,20 +50,24 @@ For example:
Mesa 12.0.2 - 12.0 branch, bugfix
</pre>
<h1 id="schedule">Release schedule</h1>
<p>
Releases should happen on Fridays. Delays can occur, although those should be kept
to a minimum.
<br>
See our <a href="release-calendar.html" target="_parent">calendar</a> for the
date and other details for individual releases.
</p>
<h2>Feature releases</h2>
<ul>
<li>Available approximatelly every three months.
<li>Available approximately every three months.
<li>Initial timeplan available 2-4 weeks before the planned branchpoint (rc1)
on the mesa-announce@ mailing list.
<li>A <a href="#prerelease">pre-release</a> announcement should be available
approximatelly 24 hours before the final (non-rc) release.
approximately 24 hours before the final (non-rc) release.
</ul>
<h2>Stable releases</h2>
@@ -69,7 +75,7 @@ approximatelly 24 hours before the final (non-rc) release.
<li>Normally available once every two weeks.
<li>Only the latest branch has releases. See note below.
<li>A <a href="#prerelease">pre-release</a> announcement should be available
approximatelly 48 hours before the actual release.
approximately 48 hours before the actual release.
</ul>
<p>
@@ -79,15 +85,24 @@ The final release from the 12.0 series Mesa 12.0.5 will be out around the same
time (or shortly after) 13.0.1 is out.
</p>
<h1 id="pickntest">Cherry-picking and testing</h1>
<p>
Commits nominated for the active branch are picked based on the
<a href="submittingpatches.html#criteria" target="_parent">criteria</a>
described in that section.
</p>
<p>
Maintainer is responsible for testing in various possible permutations of
Nomination happens on the mesa-stable@ mailing list. However, the
maintainer is responsible for checking for forgotten candidates in the
master branch. This is achieved by a combination of ad-hoc scripts and
a casual search for terms such as regression, fix, broken and similar.
</p>
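<p>
As a rough illustration of the casual search described above, the sketch below
lists commits made after the branchpoint whose message mentions one of the
typical bugfix terms. The helper name <code>list_fix_candidates</code> and the
default ref are assumptions made for this example; they are not scripts or
names taken from the Mesa tree.
</p>

```shell
# Hypothetical helper (not part of the Mesa tree): list commits in
# <branchpoint>..<tip> whose message mentions a typical bugfix term.
list_fix_candidates() {
    branchpoint="$1"            # e.g. the X.Y-branchpoint tag
    tip="${2:-origin/master}"   # assumed default; adjust to your remote
    # Several --grep options are OR-ed together; -i ignores case.
    git log --oneline -i \
        --grep='regression' --grep='fix' --grep='broken' \
        "$branchpoint..$tip"
}
```

<p>
For example, <code>list_fix_candidates X.Y-branchpoint origin/master</code>.
Every match still has to be reviewed against the nomination criteria; the
search only narrows down what to look at.
</p>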
<p>
Maintainer is also responsible for testing in various possible permutations of
the autoconf and scons build.
</p>
@@ -101,36 +116,99 @@ release. This is made <strong>only</strong> with explicit permission/request,
and the patch <strong>must</strong> be very well contained. Thus it cannot
affect more than one driver/subsystem.
</p>
<p>
Currently Ilia Mirkin and AMD devs have requested "permanent" exception.
</p>
<ul>
<li>make distcheck, scons and scons check must pass
<li>Testing with different versions of system components (LLVM and others) is also
performed where possible.
<li>As a general rule, testing with various combinations of configure
switches, depending on the specific patchset.
</ul>
<p>
Achieved by combination of local ad-hoc scripts and AppVeyor plus Travis-CI,
the latter as part of their Github integration.
Achieved by combination of local ad-hoc scripts, mingw-w64 cross
compilation and AppVeyor plus Travis-CI, the latter as part of their
Github integration.
</p>
<p>
For Windows related changes, the main contact point is Brian
Paul. Jose Fonseca can also help as a fallback contact.
</p>
<p>
For Android related changes, the main contact is Tapani
P&auml;lli. Mauro Rossi is collaborating with android-x86 and may
provide feedback about the build status in that project.
</p>
<p>
For MacOSX related changes, Jeremy Huddleston Sequoia is currently a
good contact point.
</p>
<p>
<strong>Note:</strong> If a patch in the current queue needs any additional
fix(es), then they should be squashed together.
<br>
The commit messages and the <code>cherry picked from</code> tags must be preserved.
</p>
<p>
This should be noted in the <a href="#prerelease">pre-announce</a> email.
</p>
<pre>
git show b10859ec41d09c57663a258f43fe57c12332698e
commit b10859ec41d09c57663a258f43fe57c12332698e
Author: Jonas Pfeil &lt;pfeiljonas@gmx.de&gt;
Date: Wed Mar 1 18:11:10 2017 +0100
ralloc: Make sure ralloc() allocations match malloc()'s alignment.
The header of ralloc needs to be aligned, because the compiler assumes
...
(cherry picked from commit cd2b55e536dc806f9358f71db438dd9c246cdb14)
Squashed with commit:
ralloc: don't leave out the alignment factor
Experimentation shows that without alignment factor gcc and clang choose
...
(cherry picked from commit ff494fe999510ea40e3ed5827e7818550b6de126)
</pre>
<h2>Regression/functionality testing</h2>
<p>
Less often (once or twice), shortly before the pre-release announcement.
Ensure that testing is redone if Intel devs have requested an exception, as per above.
</p>
<ul>
<li><em>no regressions should be observed for Piglit/dEQP/CTS/Vulkan on Intel platforms</em>
<li><em>no regressions should be observed for Piglit using the swrast, softpipe
and llvmpipe drivers</em>
</ul>
<p>
Currently testing is performed courtesy of the Intel OTC team and their Jenkins CI setup. Check with the Intel team over IRC how to get things set up.
</p>
<p>
Installing the driver built from the pre-announced RC branch on the
system and using it day to day until the release may be a good
idea too.
</p>
<h1 id="branch">Making a branchpoint</h1>
@@ -158,6 +236,11 @@ To setup the branchpoint:
git checkout master # make sure we're in master first
git tag -s X.Y-branchpoint -m "Mesa X.Y branchpoint"
git checkout -b X.Y
git checkout master
$EDITOR VERSION # bump the version number
git commit -as
cp docs/relnotes/{X.Y,X.Y+1}.html # copy/create relnotes template
git commit -as
git push origin X.Y-branchpoint X.Y
</pre>
@@ -165,15 +248,18 @@ To setup the branchpoint:
Now go to
<a href="https://bugs.freedesktop.org/editversions.cgi?action=add&amp;product=Mesa" target="_parent">Bugzilla</a> and add the new Mesa version X.Y.
</p>
<p>
Check for rare that there are no distribution breaking changes and revert them
if needed. Extremely rare - we had only one case so far (see
commit 2ced8eb136528914e1bf4e000dea06a9d53c7e04).
Check that there are no distribution breaking changes and revert them if needed.
For example: files being overwritten on install, etc. Happens extremely rarely -
we had only one case so far (see commit 2ced8eb136528914e1bf4e000dea06a9d53c7e04).
</p>
<p>
Proceed to <a href="#release">release</a> -rc1.
</p>
<h1 id="prerelease">Pre-release announcement</h1>
<p>
@@ -187,18 +273,22 @@ release is made.
</p>
<h2>Terminology used</h2>
<ul><li>Nominated</ul>
<p>
Patch that is nominated but yet to be merged in the patch queue/branch.
</p>
<ul><li>Queued</ul>
<p>
Patch is in the queue/branch and will feature in the next release,
barring reported regressions or objections from developers.
</p>
<ul><li>Rejected</ul>
<p>
Patch does not fit the
<a href="submittingpatches.html#criteria" target="_parent">criteria</a> and
@@ -285,6 +375,12 @@ Queued (NUMBER)
AUTHOR (NUMBER):
COMMIT SUMMARY
For example:
Jonas Pfeil (1):
ralloc: Make sure ralloc() allocations match malloc()'s alignment.
Squashed with
ralloc: don't leave out the alignment factor
Rejected (NUMBER)
=================
@@ -298,6 +394,7 @@ AUTHOR (NUMBER):
Reason: ...
</pre>
<h1 id="release">Making a new release</h1>
<p>
@@ -305,18 +402,21 @@ These are the instructions for making a new Mesa release.
</p>
<h3>Get latest source files</h3>
<p>
Ensure the latest code is available - both in your local master and the
relevant branch.
</p>
<h3>Perform basic testing</h3>
<p>
Most of the testing should already be done during the
<a href="#pickntest">cherry-pick</a> and
<a href="#prerelease">pre-announce</a> stages.
So we do a quick 'touch test':
</p>
<ul>
<li>make distcheck (you can omit this if you're not using --dist below)
<li>scons (from release tarball)
@@ -328,6 +428,7 @@ Here is one solution that I've been using.
</p>
<pre>
# Set MAKEFLAGS if you haven't already
git clean -fXd; git clean -nxd
read # quick cross check any outstanding files
export __version=`cat VERSION`
@@ -336,45 +437,66 @@ Here is one solution that I've been using.
chmod 755 -fR $__build_root; rm -rf $__build_root
mkdir -p $__build_root &amp;&amp; cd $__build_root
$__mesa_root/autogen.sh --enable-llvm-shared-libs &amp;&amp; make -j2 distcheck
# For the native builds - such as distcheck, scons, sanity test, you
# may want to specify which LLVM to use:
# export LLVM_CONFIG=/usr/lib/llvm-3.9/bin/llvm-config
# Build check the tarballs (scons)
tar -xaf mesa-$__version.tar.xz &amp;&amp; cd mesa-$__version &amp;&amp; scons &amp;&amp; cd ..
# Do a full distcheck
$__mesa_root/autogen.sh &amp;&amp; make distcheck
# Build check the tarballs (scons, linux)
tar -xaf mesa-$__version.tar.xz &amp;&amp; cd mesa-$__version
scons
cd .. &amp;&amp; rm -rf mesa-$__version
# Build check the tarballs (scons, windows/mingw)
# Temporarily drop LLVM_CONFIG, unless you have a Windows/mingw one.
# save_LLVM_CONFIG=`echo $LLVM_CONFIG`; unset LLVM_CONFIG
tar -xaf mesa-$__version.tar.xz &amp;&amp; cd mesa-$__version
scons platform=windows toolchain=crossmingw
cd .. &amp;&amp; rm -rf mesa-$__version
# Test the automake binaries
rm -rf cd mesa-$__version
tar -xaf mesa-$__version.tar.xz &amp;&amp; cd mesa-$__version
# Restore LLVM_CONFIG, if applicable:
# export LLVM_CONFIG=`echo $save_LLVM_CONFIG`; unset save_LLVM_CONFIG
./configure \
--with-dri-drivers=i965,swrast \
--with-gallium-drivers=swrast \
--with-vulkan-drivers=intel \
--enable-llvm-shared-libs \
--enable-gallium-llvm \
--enable-llvm \
--enable-glx-tls \
--enable-gbm \
--enable-egl \
--with-egl-platforms=x11,drm,wayland
make -j2 &amp;&amp; DESTDIR=`pwd`/test make -j6 install
export LD_LIBRARY_PATH=`pwd`/test/usr/local/lib/
--with-platforms=x11,drm,wayland,surfaceless
make &amp;&amp; DESTDIR=`pwd`/test make install
__glxinfo_cmd='glxinfo 2>&amp;1 | egrep -o "Mesa.*|Gallium.*|.*dri\.so"'
__glxgears_cmd='glxgears 2>&amp;1 | grep -v "configuration file"'
__es2info_cmd='es2_info 2>&amp;1 | egrep "GL_VERSION|GL_RENDERER|.*dri\.so"'
__es2gears_cmd='es2gears_x11 2>&amp;1 | grep -v "configuration file"'
test "x$LD_LIBRARY_PATH" != 'x' &amp;&amp; __old_ld="$LD_LIBRARY_PATH"
export LD_LIBRARY_PATH=`pwd`/test/usr/local/lib/:"${__old_ld}"
export LIBGL_DRIVERS_PATH=`pwd`/test/usr/local/lib/dri/
export LIBGL_DEBUG=verbose
glxinfo | egrep -o "Mesa.*"
glxgears
es2_info | egrep "GL_VERSION|GL_RENDERER"
es2gears_x11
export LIBGL_ALWAYS_SOFTWARE=1
glxinfo | egrep -o "Mesa.*|Gallium.*"
glxgears
es2_info | egrep "GL_VERSION|GL_RENDERER"
es2gears_x11
export LIBGL_ALWAYS_SOFTWARE=1
eval $__glxinfo_cmd
eval $__glxgears_cmd
eval $__es2info_cmd
eval $__es2gears_cmd
export LIBGL_ALWAYS_SOFTWARE=true
eval $__glxinfo_cmd
eval $__glxgears_cmd
eval $__es2info_cmd
eval $__es2gears_cmd
export LIBGL_ALWAYS_SOFTWARE=true
export GALLIUM_DRIVER=softpipe
glxinfo | egrep -o "Mesa.*|Gallium.*"
glxgears
es2_info | egrep "GL_VERSION|GL_RENDERER"
es2gears_x11
eval $__glxinfo_cmd
eval $__glxgears_cmd
eval $__es2info_cmd
eval $__es2gears_cmd
# Smoke test DOTA2
unset LD_LIBRARY_PATH
test "x$__old_ld" != 'x' &amp;&amp; export LD_LIBRARY_PATH="$__old_ld" &amp;&amp; unset __old_ld
unset LIBGL_DRIVERS_PATH
unset LIBGL_DEBUG
unset LIBGL_ALWAYS_SOFTWARE
@@ -399,6 +521,7 @@ be empty (TBD) at this point.
<p>
Two scripts are available to help generate portions of the release notes:
</p>
<pre>
./bin/bugzilla_mesa.sh
@@ -415,18 +538,21 @@ to be included in the release notes.
<p>
Commit these changes and push the branch.
</p>
<pre>
git push origin HEAD
</pre>
<h3>Use the release.sh script from xorg util-macros</h3>
<h3>Use the release.sh script from xorg <a href="https://cgit.freedesktop.org/xorg/util/modular/">util-modular</a></h3>
<p>
Ensure that the mesa git tree is clean via <code>git clean -fXd</code> and
start the release process.
Start the release process.
</p>
<pre>
# For the dist/distcheck, you may want to specify which LLVM to use:
# export LLVM_CONFIG=/usr/lib/llvm-3.9/bin/llvm-config
../relative/path/to/release.sh . # append --dist if you've already done distcheck above
</pre>
@@ -438,7 +564,7 @@ and SSH passphrase(s) to sign and upload the files, respectively.
<h3>Add the sha256sums to the release notes</h3>
<p>
Edit docs/relnotes/X.Y.Z.html to add the sha256sums as availabe in the mesa-X.Y.Z.announce template. Commit this change.
Edit docs/relnotes/X.Y.Z.html to add the sha256sums as available in the mesa-X.Y.Z.announce template. Commit this change.
</p>
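<p>
Before committing, it may be worth cross-checking the sums copied from the
announce template against the tarballs themselves. The sketch below assumes
the two tarballs sit in the current directory under their usual names; the
helper name <code>print_release_checksums</code> is made up for this example.
</p>

```shell
# Hypothetical helper: print the SHA256 sums of both release tarballs,
# one "HASH  FILE" line per tarball.
print_release_checksums() {
    version="$1"   # e.g. X.Y.Z
    sha256sum "mesa-$version.tar.gz" "mesa-$version.tar.xz"
}
```

<p>
Running <code>print_release_checksums X.Y.Z</code> prints one line per tarball,
which can be compared against the sums in the release notes.
</p>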
<h3>Back on mesa master, add the new release notes into the tree</h3>
@@ -453,17 +579,19 @@ Something like the following steps will do the trick:
</pre>
<p>
Also, edit docs/relnotes.html to add a link to the new release notes, and edit
docs/index.html to add a news entry. Then commit and push:
Also, edit docs/relnotes.html to add a link to the new release notes,
edit docs/index.html to add a news entry, and remove the version from
docs/release-calendar.html. Then commit and push:
</p>
<pre>
git commit -as -m "docs: add news item and link release notes for X.Y.Z"
git commit -as -m "docs: update calendar, add news item and link release notes for X.Y.Z"
git push origin master X.Y
</pre>
<h1 id="announce">Announce the release</h1>
<p>
Use the generated template during the releasing process.
</p>
@@ -472,20 +600,8 @@ Use the generated template during the releasing process.
<h1 id="website">Update the mesa3d.org website</h1>
<p>
NOTE: The recent release managers have not been performing this step
themselves, but leaving this to Brian Paul, (who has access to the
sourceforge.net hosting for mesa3d.org). Brian is more than willing to grant
the permission necessary to future release managers to do this step on their
own.
</p>
<p>
Update the web site by copying the docs/ directory's files to
/home/users/b/br/brianp/mesa-www/htdocs/ with:
<br>
<code>
sftp USERNAME,mesa3d@web.sourceforge.net
</code>
As the hosting was moved to freedesktop, git hooks are deployed to update the
website. Manually check that it is updated 5-10 minutes after the final <code>git push</code>.
</p>

@@ -21,6 +21,33 @@ The release notes summarize what's new or changed in each Mesa release.
</p>
<ul>
<li><a href="relnotes/17.2.3.html">17.2.3 release notes</a>
<li><a href="relnotes/17.2.2.html">17.2.2 release notes</a>
<li><a href="relnotes/17.1.10.html">17.1.10 release notes</a>
<li><a href="relnotes/17.2.1.html">17.2.1 release notes</a>
<li><a href="relnotes/17.1.9.html">17.1.9 release notes</a>
<li><a href="relnotes/17.2.0.html">17.2.0 release notes</a>
<li><a href="relnotes/17.1.8.html">17.1.8 release notes</a>
<li><a href="relnotes/17.1.7.html">17.1.7 release notes</a>
<li><a href="relnotes/17.1.6.html">17.1.6 release notes</a>
<li><a href="relnotes/17.1.5.html">17.1.5 release notes</a>
<li><a href="relnotes/17.1.4.html">17.1.4 release notes</a>
<li><a href="relnotes/17.1.3.html">17.1.3 release notes</a>
<li><a href="relnotes/17.1.2.html">17.1.2 release notes</a>
<li><a href="relnotes/17.0.7.html">17.0.7 release notes</a>
<li><a href="relnotes/17.1.1.html">17.1.1 release notes</a>
<li><a href="relnotes/17.0.6.html">17.0.6 release notes</a>
<li><a href="relnotes/17.1.0.html">17.1.0 release notes</a>
<li><a href="relnotes/17.0.5.html">17.0.5 release notes</a>
<li><a href="relnotes/17.0.4.html">17.0.4 release notes</a>
<li><a href="relnotes/17.0.3.html">17.0.3 release notes</a>
<li><a href="relnotes/17.0.2.html">17.0.2 release notes</a>
<li><a href="relnotes/13.0.6.html">13.0.6 release notes</a>
<li><a href="relnotes/17.0.1.html">17.0.1 release notes</a>
<li><a href="relnotes/13.0.5.html">13.0.5 release notes</a>
<li><a href="relnotes/17.0.0.html">17.0.0 release notes</a>
<li><a href="relnotes/13.0.4.html">13.0.4 release notes</a>
<li><a href="relnotes/12.0.6.html">12.0.6 release notes</a>
<li><a href="relnotes/13.0.3.html">13.0.3 release notes</a>
<li><a href="relnotes/12.0.5.html">12.0.5 release notes</a>
<li><a href="relnotes/13.0.2.html">13.0.2 release notes</a>

148
docs/relnotes/12.0.6.html Normal file
View File

@@ -0,0 +1,148 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 12.0.6 Release Notes / January 23, 2017</h1>
<p>
Mesa 12.0.6 is a bug fix release which fixes bugs found since the 12.0.5 release.
</p>
<p>
Mesa 12.0.6 implements the OpenGL 4.3 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.3. OpenGL
4.3 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
65339ba5d76a45225b8b56f9a1da9db15c569e1d163760faa2921da0a8461741 mesa-12.0.6.tar.gz
7d6da9744c1022a4c2ab6ad01a206984d00443fb691568011d01b3dd97e36448 mesa-12.0.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<p>This list is likely incomplete.</p>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92234">Bug 92234</a> - [BDW] GPU hang in Shogun2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95130">Bug 95130</a> - Derivatives of gl_Color wrong when helper pixels used</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98329">Bug 98329</a> - [dEQP, EGL, SKL, BDW, BSW] dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99030">Bug 99030</a> - [HSW, regression] transform feedback fails on Linux 4.8</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99354">Bug 99354</a> - [G71] &quot;Assertion `bkref' failed&quot; reproducible with glmark2</li>
</ul>
<h2>Changes</h2>
<p>Chad Versace (3):</p>
<ul>
<li>i965/mt: Disable aux surfaces after making miptree shareable</li>
<li>i965/mt: Disable HiZ when sharing depth buffer externally (v2)</li>
<li>anv: Handle vkGetPhysicalDeviceQueueFamilyProperties with count == 0</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>docs: add sha256 checksums for 12.0.5</li>
<li>get-typod-pick-list.sh: add new script</li>
<li>automake: use shared llvm libs for make distcheck</li>
<li>egl/wayland: use the destroy_window_callback for swrast</li>
<li>Update version to 12.0.6</li>
</ul>
<p>Fredrik Höglund (1):</p>
<ul>
<li>dri3: Fix MakeCurrent without a default framebuffer</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>nouveau: take extra push space into account for pushbuf_space calls</li>
</ul>
<p>Jason Ekstrand (19):</p>
<ul>
<li>spirv/nir: Fix some texture opcode asserts</li>
<li>spirv/nir: Add support for shadow samplers that return vec4</li>
<li>spirv/nir: Properly handle gather components</li>
<li>anv/pipeline: Set binding_table.gather_texture_start</li>
<li>nir: Add a helper for determining the type of a texture source</li>
<li>nir/lower_tex: Add some helpers for working with tex sources</li>
<li>nir/lower_tex: Add support for lowering coordinate offsets</li>
<li>i965/nir: Enable NIR lowering of txf and rect offsets</li>
<li>i965: Get rid of the do_lower_unnormalized_offsets pass</li>
<li>spirv/nir: Don't increment coord_components for array lod queries</li>
<li>anv/image: Assert that the image format is actually supported</li>
<li>spirv/nir: Move opcode selection higher up in handle_texture</li>
<li>spirv/nir: Refactor type handling in handle_texture</li>
<li>nir/spirv: Refactor coordinate handling in handle_texture</li>
<li>spirv/nir: Handle texture projectors</li>
<li>spirv/nir: Add support for ImageQuerySamples</li>
<li>anv/device: Return the right error for failed maps</li>
<li>anv/device: Implicitly unmap memory objects in FreeMemory</li>
<li>anv/descriptor_set: Write the state offset in the surface state free list.</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>spirv: Move cursor before calling vtn_ssa_value() in phi 2nd pass.</li>
<li>i965: Properly flush in hsw_pause_transform_feedback().</li>
</ul>
<p>Marek Olšák (6):</p>
<ul>
<li>cso: don't release sampler states that are bound</li>
<li>radeonsi: always restore sampler states when unbinding sampler views</li>
<li>radeonsi: fix incorrect FMASK checking in bind_sampler_states</li>
<li>radeonsi: disable CE on SI + AMDGPU</li>
<li>radeonsi: disable the constant engine (CE) on Carrizo and Stoney</li>
<li>gallium/radeon: fix the draw-calls HUD query</li>
</ul>
<p>Matt Turner (3):</p>
<ul>
<li>i965/fs: Rename opt_copy_propagate -&gt; opt_copy_propagation.</li>
<li>i965/fs: Add unit tests for copy propagation pass.</li>
<li>i965/fs: Reject copy propagation into SEL if not min/max.</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>cso: Don't restore nr_samplers in cso_restore_fragment_samplers</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>radeonsi: enable WQM in PS prolog when needed</li>
</ul>
</div>
</body>
</html>

255
docs/relnotes/13.0.4.html Normal file
View File

@@ -0,0 +1,255 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 13.0.4 Release Notes / February 1, 2017</h1>
<p>
Mesa 13.0.4 is a bug fix release which fixes bugs found since the 13.0.3 release.
</p>
<p>
Mesa 13.0.4 implements the OpenGL 4.4 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.4. OpenGL
4.4 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
a78518030b0b7d77a6c426ac3ff40f4b27fb0e2cdb0dfbe685024a46cae59bad mesa-13.0.4.tar.gz
a95d7ce8f7bd5f88585e4be3144a341236d8c0fc91f6feaec59bb8ba3120e726 mesa-13.0.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92634">Bug 92634</a> - gallium's vl_mpeg12_decoder does not work with st/va</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94512">Bug 94512</a> - X segfaults with glx-tls enabled in a x32 environment</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94900">Bug 94900</a> - HD6950 GPU lockup loop with various steam games (octodad[always], saints row 4[always], dead island[always], grid autosport[sometimes])</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98263">Bug 98263</a> - [radv] The Talos Principle fails to launch with &quot;Fatal error: Cannot set display mode.&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98914">Bug 98914</a> - mesa-vdpau-drivers: breaks vdpau for mpeg2video</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98975">Bug 98975</a> - Wasteland 2 Directors Cut: Hangs. GPU fault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99030">Bug 99030</a> - [HSW, regression] transform feedback fails on Linux 4.8</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99085">Bug 99085</a> - [EGL] dEQP-EGL.functional.sharing.gles2.multithread intermittent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99097">Bug 99097</a> - [vulkancts] dEQP-VK.image.store regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99100">Bug 99100</a> - [SKL,BDW,BSW,KBL] dEQP-VK.glsl.return.return_in_dynamic_loop_dynamic_vertex regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99144">Bug 99144</a> - Incorrect rendering using glDrawArraysInstancedBaseInstance and first != 0 on Skylake</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99154">Bug 99154</a> - Link time error when using multiple builtin functions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99158">Bug 99158</a> - vdpau segfaults and gpu locks with kodi on R9285</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99185">Bug 99185</a> - dEQP-EGL.functional.image.modify.tex_rgb5_a1_tex_subimage_rgba8</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99188">Bug 99188</a> - dEQP-EGL.functional.create_context_ext.robust_gl_30.rgb565_no_depth_no_stencil</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99210">Bug 99210</a> - ES3-CTS.functional.texture.mipmap.cube.generate.rgba5551_*</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99354">Bug 99354</a> - [G71] &quot;Assertion `bkref' failed&quot; reproducible with glmark2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99450">Bug 99450</a> - [amdgpu] Payday 2 visual glitches on some models</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99451">Bug 99451</a> - polygon offset use after free</li>
</ul>
<h2>Changes</h2>
<p>Andres Rodriguez (2):</p>
<ul>
<li>vulkan/wsi: clarify the severity of lack of DRI3 v2</li>
<li>radv: fix include order for installed headers v2</li>
</ul>
<p>Arda Coskunses (2):</p>
<ul>
<li>vulkan/wsi/x11: don't crash on null visual</li>
<li>vulkan/wsi/x11: don't crash on null wsi x11 connection</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>radv: Support loader interface version 3.</li>
</ul>
<p>Chad Versace (10):</p>
<ul>
<li>egl: Check config's surface types in eglCreate*Surface()</li>
<li>dri: Add __DRI_IMAGE_FORMAT_ARGB1555</li>
<li>mesa/texformat: Handle GL_RGBA + GL_UNSIGNED_SHORT_5_5_5_1</li>
<li>egl: Emit correct error when robust context creation fails</li>
<li>anv: Handle vkGetPhysicalDeviceQueueFamilyProperties with count == 0</li>
<li>mesa/shaderobj: Fix races on refcounts</li>
<li>meta: Disable dithering during glGenerateMipmap</li>
<li>vulkan: Add new cast macros for VkIcd types</li>
<li>vulkan: Update vk_icd.h to interface version 3</li>
<li>anv: Support loader interface version 3 (patch v2)</li>
</ul>
<p>Christian König (1):</p>
<ul>
<li>vl/zscan: fix "Fix trivial sign compare warnings"</li>
</ul>
<p>Chuck Atkins (1):</p>
<ul>
<li>glx: Add missing glproto dependency for gallium-xlib glx</li>
</ul>
<p>Damien Grassart (1):</p>
<ul>
<li>anv: return count of queue families written</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>radv: flush smem for uniform buffer bit.</li>
</ul>
<p>Emil Velikov (10):</p>
<ul>
<li>docs: add sha256 checksums for 13.0.3</li>
<li>cherry-ignore: add couple of intel_miptree_copy related patches</li>
<li>cherry-ignore: add radv: Call nir_lower_constant_initializers."</li>
<li>get-typod-pick-list.sh: add new script</li>
<li>cherry-ignore: add "_mesa_ClampColor extension/version fix"</li>
<li>cherry-ignore: add wayland race condition fix</li>
<li>egl/wayland: use the destroy_window_callback for swrast</li>
<li>automake: use shared llvm libs for make distcheck</li>
<li>get-pick-list.sh: Require explicit "13.0" for nominating stable patches</li>
<li>Update version to 13.0.4</li>
</ul>
<p>Francisco Jerez (1):</p>
<ul>
<li>anv: Fix uniform and storage buffer offset alignment limits.</li>
</ul>
<p>Fredrik Höglund (2):</p>
<ul>
<li>radv: fix dual source blending</li>
<li>dri3: Fix MakeCurrent without a default framebuffer</li>
</ul>
<p>Grazvydas Ignotas (1):</p>
<ul>
<li>mapi: update the asm code to support x32</li>
</ul>
<p>Heiko Przybyl (1):</p>
<ul>
<li>r600/sb: Fix loop optimization related hangs on eg</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>nouveau: take extra push space into account for pushbuf_space calls</li>
</ul>
<p>Jason Ekstrand (4):</p>
<ul>
<li>i965/generator/tex: Handle an immediate sampler with an indirect texture</li>
<li>anv/formats: Use the real format for B4G4R4A4_UNORM_PACK16 on gen8</li>
<li>nir/search: Only allow matching SSA values</li>
<li>isl: Mark A4B4G4R4_UNORM as supported on gen8</li>
</ul>
<p>Jonas Ådahl (1):</p>
<ul>
<li>egl/wayland: Cleanup private display connection when init fails</li>
</ul>
<p>Kenneth Graunke (7):</p>
<ul>
<li>i965: Don't bail on vertex element processing if we need draw params.</li>
<li>i965: Fix last slot calculations</li>
<li>i965: Fix texturing in the vec4 TCS and GS backends.</li>
<li>spirv: Move cursor before calling vtn_ssa_value() in phi 2nd pass.</li>
<li>i965: Make BLORP disable the NP Z PMA stall fix.</li>
<li>glsl: Use ir_var_temporary when generating inline functions.</li>
<li>i965: Properly flush in hsw_pause_transform_feedback().</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>vdpau: call texture_get_handle while the mutex is being held</li>
<li>va: call texture_get_handle while the mutex is being held</li>
<li>radeonsi: for the tess barrier, only use emit_waitcnt on SI and LLVM 3.9+</li>
<li>radeonsi: don't forget to add HTILE to the buffer list for texturing</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>cso: Don't restore nr_samplers in cso_restore_fragment_samplers</li>
</ul>
<p>Nanley Chery (3):</p>
<ul>
<li>anv/cmd_buffer: Fix arrayed depth/stencil attachments</li>
<li>anv/cmd_buffer: Fix programmed HiZ qpitch</li>
<li>anv/image: Disable HiZ for depth buffer arrays</li>
</ul>
<p>Nayan Deshmukh (1):</p>
<ul>
<li>st/va: delay calling begin_frame until we have all parameters</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>freedreno: some fence cleanup</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>gallium/hud: add missing break in hud_cpufreq_graph_install()</li>
</ul>
<p>Timothy Arceri (3):</p>
<ul>
<li>nir: Turn imov/fmov of undef into undef</li>
<li>glsl: fix opt_minmax redundancy checks against baserange</li>
<li>util: fix list_is_singular()</li>
</ul>
<p>Zachary Michaels (1):</p>
<ul>
<li>radeonsi: Always leave poly_offset in a valid state</li>
</ul>
</div>
</body>
</html>

210
docs/relnotes/13.0.5.html Normal file

@@ -0,0 +1,210 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 13.0.5 Release Notes / February 20, 2017</h1>
<p>
Mesa 13.0.5 is a bug fix release which fixes bugs found since the 13.0.4 release.
</p>
<p>
Mesa 13.0.5 implements the OpenGL 4.4 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.4. OpenGL
4.4 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
7e45e3812078726eabca6d9384364bf035a3c4279024ec9090dd1b19a8989926 mesa-13.0.5.tar.gz
bfcea7e2c801525a60895c8aff11aa68457ee9aa35d01a4638e1f310a3f5ef87 mesa-13.0.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98329">Bug 98329</a> - [dEQP, EGL, SKL, BDW, BSW] dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98421">Bug 98421</a> - src/loader/loader.c:111:40: error: unknown type name drmDevicePtr</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98526">Bug 98526</a> - glsl/tests/general-ir-test regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99532">Bug 99532</a> - Compute shader doesn't give right result under some circumstances</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99631">Bug 99631</a> - segfault with OSVRTrackerView and openscenegraph git master</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99633">Bug 99633</a> - rasterizer/core/clip.h:279:49: error: const struct API_STATE has no member named linkageCount</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99692">Bug 99692</a> - [radv] Mostly broken on Hawaii PRO/CIK ASICs</li>
</ul>
<h2>Changes</h2>
<p>Bartosz Tomczyk (2):</p>
<ul>
<li>r600: Fix stack overflow</li>
<li>r600/sb: Fix memory leak</li>
</ul>
<p>Bruce Cherniak (1):</p>
<ul>
<li>swr: [rasterizer core] Remove dead code Clipper::ClipScalar()</li>
</ul>
<p>Chad Versace (1):</p>
<ul>
<li>i965/mt: Disable HiZ when sharing depth buffer externally (v2)</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>radv: change base aligmment for allocated memory.</li>
<li>radv: fix cik macroModeIndex.</li>
<li>radv: adopt some init config workarounds from radeonsi.</li>
</ul>
<p>Derek Foreman (1):</p>
<ul>
<li>egl/dri2: add image_loader_extension back into loader extensions for wayland</li>
</ul>
<p>Emil Velikov (26):</p>
<ul>
<li>docs: add sha256 checksums for 13.0.4</li>
<li>configure.ac: list radeon in --with-vulkan-drivers help string</li>
<li>i965: automake: correctly set MKDIR_GEN</li>
<li>freedreno: automake: correctly set MKDIR_GEN</li>
<li>i965: automake: include builddir prior to srcdir</li>
<li>i915: automake: include builddir prior to srcdir</li>
<li>egl: automake: include builddir prior to srcdir</li>
<li>clover: automake: include builddir prior to srcdir</li>
<li>st/dri: automake: include builddir prior to srcdir</li>
<li>d3dadapter9: automake: include builddir prior to srcdir</li>
<li>glx: automake: include builddir prior to srcdir</li>
<li>glx/apple: automake: include builddir prior to srcdir</li>
<li>glx/windows: automake: include builddir prior to srcdir</li>
<li>loader: automake: include builddir prior to srcdir</li>
<li>mapi: automake: include builddir prior to srcdir</li>
<li>radeon, r200: automake: include builddir prior to srcdir</li>
<li>dri/swrast: automake: include builddir prior to srcdir</li>
<li>dri/osmesa: automake: include builddir prior to srcdir</li>
<li>mesa/tests: automake: include builddir prior to srcdir</li>
<li>bin/get-extra-pick-list: use git merge-base to get the branchpoint</li>
<li>bin/get-extra-pick-list: rework to use already_picked list</li>
<li>bin/get-typod-pick-list.sh: limit `git grep ...' to only as needed</li>
<li>bin/get-pick-list.sh: limit `git grep ...' only as needed</li>
<li>bin/get-pick-list.sh: remove ancient way of nominating patches</li>
<li>bin/get-fixes-pick-list.sh: add new script</li>
<li>Update version to 13.0.5</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>vc4: Avoid emitting small immediates for UBO indirect load address guards.</li>
</ul>
<p>Hans de Goede (1):</p>
<ul>
<li>glx/glvnd: Fix GLXdispatchIndex sorting</li>
</ul>
<p>Ian Romanick (11):</p>
<ul>
<li>linker: Slight code rearrange to prevent duplication in the next commit</li>
<li>linker: Accurately track gl_uniform_block::stageref</li>
<li>glsl: Split process_block_array into two functions</li>
<li>glsl: Fix wonkey indentation left from previous commit</li>
<li>glsl: Track the linearized array index for each UBO instance array element</li>
<li>glsl: Use simpler visitor to determine which UBO and SSBO blocks are used</li>
<li>glsl: Add tracking for elements of an array-of-arrays that have been accessed</li>
<li>glsl: Add structures to track accessed elements of a single array</li>
<li>glsl: Mark a set of array elements as accessed using a list of array_deref_range</li>
<li>glsl: Walk a list of ir_dereference_array to mark array elements as accessed</li>
<li>linker: Accurately mark a uniform block instance array element as used in a stage</li>
</ul>
<p>Ilia Mirkin (3):</p>
<ul>
<li>vbo: process buffer binding state changes on draw when recording</li>
<li>st/mesa: MAX_VARYING is the max supported number of patch varyings, not min</li>
<li>nvc0: disable linked tsc mode in compute launch descriptor</li>
</ul>
<p>Jason Ekstrand (11):</p>
<ul>
<li>nir/search: Use the correct bit size for integer comparisons</li>
<li>i965/blorp: Use the correct ISL format for combined depth/stencil</li>
<li>intel/blorp: Handle clearing of A4B4G4R4 on all platforms</li>
<li>isl/formats: Only advertise sampling for A4B4G4R4 on Broadwell</li>
<li>anv: Flush render cache before STATE_BASE_ADDRESS on gen7</li>
<li>anv: Improve flushing around STATE_BASE_ADDRESS</li>
<li>vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetFormats</li>
<li>vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetPresentModes</li>
<li>vulkan/wsi: Lower the maximum image sizes</li>
<li>i965/sampler_state: Pass texObj into update_sampler_state</li>
<li>i965/sampler_state: Set the "Base Mip Level" field on Sandy Bridge</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Unbind deleted shaders from brw_context, fixing malloc heisenbug.</li>
</ul>
<p>Lionel Landwerlin (5):</p>
<ul>
<li>anv: don't require render target isl bit for depth/stencil surfaces</li>
<li>anv: set command buffer to NULL when allocations fail</li>
<li>anv: fix descriptor pool internal size allocation</li>
<li>spirv: handle OpUndef as part of the variable parsing pass</li>
<li>spirv: handle undefined components for OpVectorShuffle</li>
</ul>
<p>Marc-André Lureau (1):</p>
<ul>
<li>tgsi-dump: dump label if instruction has one</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>radeonsi: always set the TCL1_ACTION_ENA when invalidating L2</li>
<li>gallium/radeon: fix performance of buffer readbacks</li>
</ul>
<p>Topi Pohjolainen (2):</p>
<ul>
<li>i965: Make depth clear flushing more explicit</li>
<li>i965/gen6: Issue direct depth stall and flush after depth clear</li>
</ul>
<p>Vinson Lee (2):</p>
<ul>
<li>scons: Require libdrm &gt;= 2.4.66 for DRM.</li>
<li>util: Fix Clang trivial destructor check.</li>
</ul>
</div>
</body>
</html>

287
docs/relnotes/13.0.6.html Normal file

@@ -0,0 +1,287 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 13.0.6 Release Notes / March 20, 2017</h1>
<p>
Mesa 13.0.6 is a bug fix release which fixes bugs found since the 13.0.5 release.
</p>
<p>
Mesa 13.0.6 implements the OpenGL 4.4 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.4. OpenGL
4.4 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
1076590f29103f022a2cd87e6dff6ae77072013745603d06b0410c373ab2bb1a mesa-13.0.6.tar.gz
29ef104a7fc082d352b1599bd6cb1d040be424ccd22f5e0eb7ee9b0e9acd3597 mesa-13.0.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68504">Bug 68504</a> - 9.2-rc1 workaround for clover build failure on ppc/altivec: cannot convert 'bool' to '__vector(4) __bool int' in return</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97102">Bug 97102</a> - [dri][swr] stack overflow / infinite loop with GALLIUM_DRIVER=swr</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98869">Bug 98869</a> - Electronic Super Joy graphic artefacts (regression,bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99401">Bug 99401</a> - [g33] regression: piglit.spec.!opengl 1_0.gl-1_0-beginend-coverage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99456">Bug 99456</a> - Firefox crashing when opening about:support with WebGL2 enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99677">Bug 99677</a> - heap-use-after-free in glsl</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99715">Bug 99715</a> - Don't print: &quot;Note: Buggy applications may crash, if they do please report to vendor&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99850">Bug 99850</a> - Tessellation bug on Carrizo</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100049">Bug 100049</a> - &quot;ralloc: Make sure ralloc() allocations match malloc()'s alignment.&quot; causes seg fault in 32bit build</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (2):</p>
<ul>
<li>radv: Emit pending flushes before executing a secondary command buffer</li>
<li>radv: Flush before copying with PKT3_WRITE_DATA in CmdUpdateBuffer</li>
</ul>
<p>Bartosz Tomczyk (1):</p>
<ul>
<li>glsl: fix heap-buffer-overflow</li>
</ul>
<p>Bas Nieuwenhuizen (8):</p>
<ul>
<li>radv: Pass CMASK alignment to application.</li>
<li>radv: Pass DCC alignment to application.</li>
<li>radv: Never try to create more than max_sets descriptor sets.</li>
<li>radv: Reset emitted compute pipeline when calling secondary cmd buffer.</li>
<li>radv: Only use PKT3_OCCLUSION_QUERY when it doesn't hang.</li>
<li>radv: Use correct size for availability flag.</li>
<li>radv: Disable HTILE for textures with multiple layers/levels.</li>
<li>radv: Emit cache flushes before CP DMA.</li>
</ul>
<p>Ben Crocker (3):</p>
<ul>
<li>gallivm: Improve debug output (V2)</li>
<li>gallivm: Override getHostCPUName() "generic" w/ "pwr8" (v4)</li>
<li>gallivm: Reenable PPC VSX (v3)</li>
</ul>
<p>Brendan King (1):</p>
<ul>
<li>egl/dri3: implement query surface hook</li>
</ul>
<p>Bruce Cherniak (1):</p>
<ul>
<li>swr: Prune empty nodes in CalculateProcessorTopology.</li>
</ul>
<p>Connor Abbott (1):</p>
<ul>
<li>anv: fix Get*MemoryRequirements for !LLC</li>
</ul>
<p>Dave Airlie (13):</p>
<ul>
<li>radv: program a default point size.</li>
<li>radv: handle transfer_write as a dst flag.</li>
<li>radv/ac: handle nir irem opcode.</li>
<li>radv/ac: implement txs for buffer textures.</li>
<li>radv/ac: correctly size shared memory usage.</li>
<li>radv/ac: avoid the fmask path when doing txs.</li>
<li>radv: pass FMASK alignment to application</li>
<li>tgsi: fix memory leak in tgsi sanity check</li>
<li>radv: fix depth format in blit2d.</li>
<li>radv: fix txs for sampler buffers</li>
<li>radv: drop Z24 support.</li>
<li>radv: disable mip point pre clamping.</li>
<li>radv: setup llvm target data layout</li>
</ul>
<p>Emil Velikov (6):</p>
<ul>
<li>docs: add sha256 checksums for 13.0.5</li>
<li>Revert "get-pick-list.sh: Require explicit "13.0" for nominating stable patches"</li>
<li>cherry-ignore: don't pick nir_op_pack_double optimisation fix</li>
<li>i965: move brw_define.h ifndef guard to the top</li>
<li>cherry-ignore: add ANV fast clears related fixes</li>
<li>Update version to 13.0.6</li>
</ul>
<p>Fredrik Höglund (2):</p>
<ul>
<li>radv: fix the dynamic buffer index in vkCmdBindDescriptorSets</li>
<li>radv/ac: fix multiple descriptor sets with dynamic buffers</li>
</ul>
<p>George Kyriazis (1):</p>
<ul>
<li>swr: Align query results allocation</li>
</ul>
<p>Grazvydas Ignotas (3):</p>
<ul>
<li>r300g: only allow byteswapped formats on big endian</li>
<li>gallium/u_queue: fix a crash with atexit handlers</li>
<li>gallium/u_queue: set num_threads correctly if not all threads start</li>
</ul>
<p>Gregory Hainaut (1):</p>
<ul>
<li>glapi: fix typo in count_scale</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>mesa: Don't advertise GL_OES_read_format in core profile</li>
</ul>
<p>Ilia Mirkin (8):</p>
<ul>
<li>nvc0: increase number of ubo binding points</li>
<li>nvc0/ir: fix robustness guarantees for constbuf loads on kepler+ compute</li>
<li>nvc0/ir: fix ubo max clamp, reset file index</li>
<li>gm107/ir: fix address offset bitfield for ATOMS</li>
<li>nvc0: set the render condition in the compute object</li>
<li>st/mesa: don't pass compare mode for stencil-sampled textures</li>
<li>nvc0: take extra pushbuf space into account for pushbuf_space calls</li>
<li>nvc0: increase alignment to 256 for texture buffers on fermi</li>
</ul>
<p>Jacob Lifshay (1):</p>
<ul>
<li>vulkan/wsi: Improve the DRI3 error message</li>
</ul>
<p>Jason Ekstrand (11):</p>
<ul>
<li>i965: Use a better guardband calculation.</li>
<li>intel/blorp: Swizzle clear colors on the CPU</li>
<li>i965/fs: Remove the inline pack_double_2x32 optimization</li>
<li>anv: Add an invalidate_range helper</li>
<li>anv/query: clflush the bo map on non-LLC platforms</li>
<li>genxml: Make MI_STORE_DATA_IMM more consistent</li>
<li>anv/query: Perform CmdResetQueryPool on the GPU</li>
<li>blorp/exec: Use uint32_t for copying varying data</li>
<li>intel/blorp: Explicitly flush all allocated state</li>
<li>anv: Accurately advertise dynamic descriptor limits</li>
<li>anv: Properly handle destroying NULL devices and instances</li>
</ul>
<p>Jonas Pfeil (1):</p>
<ul>
<li>ralloc: Make sure ralloc() allocations match malloc()'s alignment.</li>
</ul>
<p>Jose Maria Casanova Crespo (1):</p>
<ul>
<li>glsl: non-last member unsized array on SSBO must fail compilation on GLSL ES 3.1</li>
</ul>
<p>Kenneth Graunke (7):</p>
<ul>
<li>i965: Fix fast depth clears for surfaces with a dimension of 16384.</li>
<li>i965: Use a UW source type for CS_OPCODE_CS_TERMINATE.</li>
<li>i965: Fix check for negative pitch in can_do_fast_copy_blit().</li>
<li>i965: Support the force_glsl_version driconf option.</li>
<li>i965: Combine the Gen6 SF and Clip viewport atoms.</li>
<li>mesa: Do (TCS &amp;&amp; !TES) draw time validation in ES as well.</li>
<li>egl: Ensure ResetNotificationStrategy matches for shared contexts.</li>
</ul>
<p>Lionel Landwerlin (3):</p>
<ul>
<li>spirv: don't assert with location decorations on non i/o variables</li>
<li>anv: wsi: report presentation error per image request</li>
<li>i965/fs: fix uninitialized memory access</li>
</ul>
<p>Marc Di Luzio (1):</p>
<ul>
<li>glsl: correct compute shader checks for memoryBarrier functions</li>
</ul>
<p>Marek Olšák (10):</p>
<ul>
<li>st/mesa: destroy pipe_context before destroying st_context (v2)</li>
<li>radeonsi: don't invoke DCC decompression in update_all_texture_descriptors</li>
<li>radeonsi: fix UNSIGNED_BYTE index buffer fallback with non-zero start (v2)</li>
<li>gallium/util: remove unused u_index_modify helpers</li>
<li>gallium/u_index_modify: don't add PIPE_TRANSFER_UNSYNCHRONIZED unconditionally</li>
<li>gallium/u_queue: fix random crashes when the app calls exit()</li>
<li>st/mesa: reset sample_mask, min_sample, and render_condition for PBO ops</li>
<li>st/mesa: set blend state for PBO readbacks</li>
<li>radeonsi: fix broken tessellation on Carrizo and Stoney</li>
<li>radeonsi: mark all bound shader buffer ranges as initialized</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>clover: Work around build failure with AltiVec.</li>
</ul>
<p>Nicolai Hähnle (12):</p>
<ul>
<li>mesa/main: fix meta caller of _mesa_ClampColor</li>
<li>radeonsi: fix texture gather on stencil textures</li>
<li>glsl: split DIV_TO_MUL_RCP into single- and double-precision flags</li>
<li>glx/dri3: handle NULL pointers in loader-to-DRI3 drawable conversion</li>
<li>glx/dri3: guard in_current_context against a disappeared drawable</li>
<li>glx: guard swap-interval functions against destroyed drawables</li>
<li>dri/common: clear the loaderPrivate pointer in driDestroyDrawable</li>
<li>winsys/amdgpu: reduce max_alloc_size based on GTT limits</li>
<li>radeonsi: handle MultiDrawIndirect in si_get_draw_start_count</li>
<li>radeonsi: fix UINT/SINT clamping for 10-bit formats on &lt;= CIK</li>
<li>st/glsl_to_tgsi: avoid iterating past the head of the instruction list</li>
<li>st/mesa: inform the driver of framebuffer changes before compute dispatches</li>
</ul>
<p>Samuel Iglesias Gonsálvez (6):</p>
<ul>
<li>glsl: fix heap-use-after-free in ast_declarator_list::hir()</li>
<li>i965/fs: mark last DF uniform array element as 64 bit live one</li>
<li>i965/fs: detect different bit size accesses to uniforms to push them in proper locations</li>
<li>i965/fs: fix indirect load DF uniforms on BSW/BXT</li>
<li>i965/fs: fix source type when emitting MOV_INDIRECT to read ICP handles</li>
<li>i965/fs: emit MOV_INDIRECT with the source with the right register type</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>winsys/amdgpu: avoid potential segfault in amdgpu_bo_map()</li>
</ul>
</div>
</body>
</html>


@@ -14,7 +14,7 @@
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.0.0 Release Notes / TBD</h1>
<h1>Mesa 17.0.0 Release Notes / February 13, 2017</h1>
<p>
Mesa 17.0.0 is a new development release.
@@ -33,7 +33,8 @@ because compatibility contexts are not supported.
<h2>SHA256 checksums</h2>
<pre>
TBD.
696578f0b83796470511a88a95fff15a2a25fa201a9e487716f2ca20c177c3ab mesa-17.0.0.tar.gz
39db3d59700159add7f977307d12a7dfe016363e760ad82280ac4168ea668481 mesa-17.0.0.tar.xz
</pre>
@@ -62,13 +63,222 @@ Note: some of the new features are only available with certain drivers.
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=70623">Bug 70623</a> - libglx.so: undefined symbol: _glapi_tls_Context</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=72902">Bug 72902</a> - [IVB/HSW/BDW] DOTA2 segfaults unless Mesa is configured with (non-default) --enable-glx-tls</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=73778">Bug 73778</a> - _glapi_tls_Dispatch undefined</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77662">Bug 77662</a> - Fail to render to different faces of depth-stencil cube map</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=89043">Bug 89043</a> - undefined symbol: _glapi_tls_Dispatch</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=91281">Bug 91281</a> - Tonga VCE 2160p encode fails with BO to small for addr</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92234">Bug 92234</a> - [BDW] GPU hang in Shogun2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92634">Bug 92634</a> - gallium's vl_mpeg12_decoder does not work with st/va</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92760">Bug 92760</a> - Add FP64 support to the i965 shader backends</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=92925">Bug 92925</a> - Incorrect GEN for ASTC in Surface Format Table</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93551">Bug 93551</a> - Divinity: Original Sin Enhanced Edition(Native) crash on start</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94512">Bug 94512</a> - X segfaults with glx-tls enabled in a x32 environment</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94900">Bug 94900</a> - HD6950 GPU lockup loop with various steam games (octodad[always], saints row 4[always], dead island[always], grid autosport[sometimes])</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=94904">Bug 94904</a> - [vulkan, BSW] dEQP-VK.api.object_management.multithreaded_per_thread_device intermittent crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95460">Bug 95460</a> - Please add more drivers (freedreno, virgl) to features.txt status document</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96959">Bug 96959</a> - nop.sat generated by pow workaround?</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97102">Bug 97102</a> - [dri][swr] stack overflow / infinite loop with GALLIUM_DRIVER=swr</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97232">Bug 97232</a> - Line rendering broken in Dolphin when using gl_ClipDistance</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97287">Bug 97287</a> - GL45-CTS.vertex_attrib_binding.basic-inputL-case1 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97321">Bug 97321</a> - Query INFO_LOG_LENGTH for empty info log should return 0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97420">Bug 97420</a> - &quot;#version 0&quot; crashes glsl_compiler</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97422">Bug 97422</a> - trying to call a number as a function results into a crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97447">Bug 97447</a> - GL 3.0 compatibility context exposes GL_ARB_compute_shader</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97473">Bug 97473</a> - Memory corruption when uploading DXT5 cubemap faces</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97715">Bug 97715</a> - [ILK,G45,G965] piglit.spec.arb_separate_shader_objects.misc api error checks</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97779">Bug 97779</a> - [regression, bisected][BDW, GPU hang] stuck on render ring, always reproducible</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97804">Bug 97804</a> - Later precision statement isn't overriding earlier one</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97952">Bug 97952</a> - /usr/include/string.h:518:12: error: exception specification in declaration does not match previous declaration</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97967">Bug 97967</a> - glsl/tests/cache-test regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98005">Bug 98005</a> - VCE dual instance encoding inconsistent since st/va: enable dual instances encode by sync surface</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98012">Bug 98012</a> - [IVB] Segfault when running Dolphin twice with Vulkan</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98134">Bug 98134</a> - dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.draw_buffers wants a different GL error code</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98172">Bug 98172</a> - Concurrent call to glClientWaitSync results in segfault in one of the waiters.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98238">Bug 98238</a> - witcher 2: objects are black when changing lod</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98243">Bug 98243</a> - dEQP mismatched UBO precision qualifiers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98245">Bug 98245</a> - GLES3.1 link negative dEQP &quot;expected linking to fail, but passed.&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98250">Bug 98250</a> - dEQP-GLES31.functional.debug.negative_coverage.get_error.texture.texparameterIiv/texparameterIuiv failure</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98263">Bug 98263</a> - [radv] The Talos Principle fails to launch with &quot;Fatal error: Cannot set display mode.&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98297">Bug 98297</a> - Can't configure a desktop with 3x4k monitors in one row</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98299">Bug 98299</a> - Compute shaders generate stupid divides</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98307">Bug 98307</a> - &quot;st/glsl_to_tgsi: explicitly track all input and output declaration&quot; broke flightgear colors on rs780</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98326">Bug 98326</a> - [dEQP, EGL] pbuffer depth/stencil tests fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98327">Bug 98327</a> - [dEQP, EGL] dEQP-EGL.functional.resize not supported</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98328">Bug 98328</a> - [dEQP, EGL] luminance tests fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98329">Bug 98329</a> - [dEQP, EGL, SKL, BDW, BSW] dEQP-EGL.functional.image.render_multiple_contexts.gles2_renderbuffer_depth16_depth_buffer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98330">Bug 98330</a> - [dEQP, EGL] dEQP-EGL.functional.buffer_age.no_preserve fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98339">Bug 98339</a> - dEQP-EGL: Got EGL_BAD_MATCH: eglCreateSyncKHR()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98343">Bug 98343</a> - dEQP-EGL: GL_INVALID_ENUM at teglCreateContextExtTests</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98415">Bug 98415</a> - Vulkan Driver JSON file contains incorrect field</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98421">Bug 98421</a> - src/loader/loader.c:111:40: error: unknown type name drmDevicePtr</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98431">Bug 98431</a> - UnrealEngine v4 demos startup fails to blorp blit assert</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98480">Bug 98480</a> - Support R8 image texture in ES 3.1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98512">Bug 98512</a> - radeon r600 vdpau: Invalid command stream: texture bo too small</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98518">Bug 98518</a> - [r600g, bisected] regression: NI/Turks MSAA texture corruption with FreeCAD and Wine games</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98526">Bug 98526</a> - glsl/tests/general-ir-test regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98595">Bug 98595</a> - glsl: ralloc assertion &quot;info-&gt;canary == CANARY&quot; failed</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98599">Bug 98599</a> - xterm menus corrupt since tgsi/scan: handle indirect image indexing correctly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98632">Bug 98632</a> - Fix build on Hurd without PATH_MAX</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98681">Bug 98681</a> - ir_builder_print_visitor.cpp:401:67: error: expected ')' before 'PRIx64'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98694">Bug 98694</a> - &quot;(5=2)?1:1&quot; as array size decleration crashes glsl_compiler</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98740">Bug 98740</a> - bitcode.cpp:102:8: error: Error is not a member of llvm</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98767">Bug 98767</a> - [swrast] ralloc.c:84: get_header: Assertion `info-&gt;canary == CANARY' failed.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98774">Bug 98774</a> - glsl/tests/warnings-test regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98815">Bug 98815</a> - [SKL/BDW GT2] large perf regression in TessMark</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98840">Bug 98840</a> - nir clone test fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98893">Bug 98893</a> - [SKL] piglit.spec.arb_shader_image_load_store.semantics intermittent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98914">Bug 98914</a> - mesa-vdpau-drivers: breaks vdpau for mpeg2video</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98917">Bug 98917</a> - [BDW SKL BSW KBL] Tessellation CTS tests regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98975">Bug 98975</a> - Wasteland 2 Directors Cut: Hangs. GPU fault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99010">Bug 99010</a> - --disable-gallium-llvm no longer recognized</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99013">Bug 99013</a> - [regression, bisected] radeonsi: commit 4c8c13b3 &quot;Use amdgcn intrinsics for fs interpolation&quot; makes system unusable</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99030">Bug 99030</a> - [HSW, regression] transform feedback fails on Linux 4.8</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99038">Bug 99038</a> - [dEQP, EGL, SKL, BDW, BSW] dEQP-EGL.functional.negative_api.create_pixmap_surface crashes</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99072">Bug 99072</a> - [byt,ivb,snb] ES3-CTS.gtf.GL3Tests.shadow regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99085">Bug 99085</a> - [EGL] dEQP-EGL.functional.sharing.gles2.multithread intermittent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99097">Bug 99097</a> - [vulkancts] dEQP-VK.image.store regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99100">Bug 99100</a> - [SKL,BDW,BSW,KBL] dEQP-VK.glsl.return.return_in_dynamic_loop_dynamic_vertex regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99119">Bug 99119</a> - swr_fence_work.cpp(42): error: argument of type &quot;std::nullptr_t&quot; is incompatible with parameter of type &quot;unsigned long&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99144">Bug 99144</a> - Incorrect rendering using glDrawArraysInstancedBaseInstance and first != 0 on Skylake</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99154">Bug 99154</a> - Link time error when using multiple builtin functions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99158">Bug 99158</a> - vdpau segfaults and gpu locks with kodi on R9285</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99185">Bug 99185</a> - dEQP-EGL.functional.image.modify.tex_rgb5_a1_tex_subimage_rgba8</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99188">Bug 99188</a> - dEQP-EGL.functional.create_context_ext.robust_gl_30.rgb565_no_depth_no_stencil</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99210">Bug 99210</a> - ES3-CTS.functional.texture.mipmap.cube.generate.rgba5551_*</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99214">Bug 99214</a> - Crash in library libswrAVX.so when assigning vertex buffer object pointers with elements of type GL_DOUBLE</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99219">Bug 99219</a> - The Stanley Parable GPU hang when starting a new game</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99229">Bug 99229</a> - [G33] thousands of tests crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99231">Bug 99231</a> - [HSW][i965] Crash in upload_3dstate_streamout()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99287">Bug 99287</a> - piglit.spec.glsl-1_10.execution.vs-nested-return-sibling-loop regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99303">Bug 99303</a> - [REGRESSION][BISECTED] DMs are crashing on start with &quot;radeon&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99314">Bug 99314</a> - [g33] glsl regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99339">Bug 99339</a> - Blender line rendering broken after removing XY clipping of lines</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99354">Bug 99354</a> - [G71] &quot;Assertion `bkref' failed&quot; reproducible with glmark2</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99389">Bug 99389</a> - Mesa build broken: sid_tables.h</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99391">Bug 99391</a> - [ILK,G45,G965] piglit regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99401">Bug 99401</a> - [g33] regression: piglit.spec.!opengl 1_0.gl-1_0-beginend-coverage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99419">Bug 99419</a> - Crash(Segmentation fault) si_shader_select in Master Of Orion</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99450">Bug 99450</a> - [amdgpu] Payday 2 visual glitches on some models</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99451">Bug 99451</a> - polygon offset use after free</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99456">Bug 99456</a> - Firefox crashing when opening about:support with WebGL2 enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99631">Bug 99631</a> - segfault with OSVRTrackerView and openscenegraph git master</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99633">Bug 99633</a> - rasterizer/core/clip.h:279:49: error: const struct API_STATE has no member named linkageCount</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99637">Bug 99637</a> - VLC video has corrupted colors when using VDPAU output on Radeon SI</li>
</ul>
<h2>Changes</h2>
TBD.
<ul>
<li>Building RADV requires --enable-gallium-llvm</li>
<li>The vulkan headers vk_platform.h and vulkan.h are no longer installed</li>
<li>The configure options --with-sha1 and --disable-shader-cache are
removed alongside their respective library requirements</li>
</ul>
</div>
</body>

221
docs/relnotes/17.0.1.html Normal file

@@ -0,0 +1,221 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.0.1 Release Notes / March 4, 2017</h1>
<p>
Mesa 17.0.1 is a bug fix release which fixes bugs found since the 17.0.0 release.
</p>
<p>
Mesa 17.0.1 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
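<p>
As a rough illustration of the version check described above, the sketch below
parses the leading "major.minor" number from a string of the kind that
glGetString(GL_VERSION) returns on desktop OpenGL. The helper names
parse_gl_version and supports_gl45 and the sample strings are hypothetical and
not part of Mesa; the actual string depends on the driver and on the context
that was created.
</p>
<pre>
```python
import re


def parse_gl_version(version_string):
    """Return (major, minor) from a desktop GL_VERSION string."""
    # A desktop GL_VERSION string begins with "major.minor", optionally
    # followed by ".release" and vendor-specific text after a space.
    found = re.match(r"(\d+)\.(\d+)", version_string)
    if found is None:
        raise ValueError("no leading version number in %r" % version_string)
    return int(found.group(1)), int(found.group(2))


def supports_gl45(version_string):
    """True if the reported version is at least OpenGL 4.5."""
    # Tuples compare element by element, so (4, 6) and (5, 0) also pass.
    return parse_gl_version(version_string) >= (4, 5)
```
</pre>
<p>
With these assumed helpers, supports_gl45("4.5 (Core Profile) Mesa 17.0.1") is
true and supports_gl45("3.0 Mesa 17.0.1") is false. OpenGL ES version strings
carry a different prefix and are not handled by this sketch.
</p>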
<h2>SHA256 checksums</h2>
<pre>
e819bd3e515dac26faf9836d8f27a4ddf05323b9b23afb6c06536d4ac82e2743 mesa-17.0.1.tar.gz
96fd70ef5f31d276a17e424e7e1bb79447ccbbe822b56844213ef932e7ad1b0c mesa-17.0.1.tar.xz
</pre>
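<p>
One possible way to check a downloaded tarball against the checksums listed
above is sketched below. It uses only Python's standard hashlib module; the
function names sha256_of_file and matches_published are illustrative
assumptions, and the path passed in is whatever location the tarball was
saved to.
</p>
<pre>
```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare a file's digest with a checksum copied from the release notes."""
    return sha256_of_file(path) == published_hex.strip().lower()
```
</pre>
<p>
A call such as matches_published("mesa-17.0.1.tar.xz", checksum), where
checksum is the hex string copied from the list above, returns True only when
the file on disk hashes to the published value.
</p>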
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98869">Bug 98869</a> - Electronic Super Joy graphic artefacts (regression,bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99532">Bug 99532</a> - Compute shader doesn't give right result under some circumstances</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99677">Bug 99677</a> - heap-use-after-free in glsl</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99692">Bug 99692</a> - [radv] Mostly broken on Hawaii PRO/CIK ASICs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99850">Bug 99850</a> - Tessellation bug on Carrizo</li>
</ul>
<h2>Changes</h2>
<p>Bas Nieuwenhuizen (4):</p>
<ul>
<li>radv: Never try to create more than max_sets descriptor sets.</li>
<li>radv: Reset emitted compute pipeline when calling secondary cmd buffer.</li>
<li>radv: Only use PKT3_OCCLUSION_QUERY when it doesn't hang.</li>
<li>radv: Use correct size for availability flag.</li>
</ul>
<p>Ben Crocker (3):</p>
<ul>
<li>gallivm: Reenable PPC VSX (v3)</li>
<li>gallivm: Improve debug output (V2)</li>
<li>gallivm: Override getHostCPUName() "generic" w/ "pwr8" (v4)</li>
</ul>
<p>Brendan King (1):</p>
<ul>
<li>egl/dri3: implement query surface hook</li>
</ul>
<p>Christian Gmeiner (2):</p>
<ul>
<li>etnaviv: move pctx initialisation to avoid a null dereference</li>
<li>etnaviv: remove number of pixel pipes validation</li>
</ul>
<p>Connor Abbott (1):</p>
<ul>
<li>anv: fix Get*MemoryRequirements for !LLC</li>
</ul>
<p>Daniel Stone (1):</p>
<ul>
<li>egl/wayland: Don't use DRM format codes for SHM</li>
</ul>
<p>Dave Airlie (6):</p>
<ul>
<li>tgsi: fix memory leak in tgsi sanity check</li>
<li>radv: change base aligmment for allocated memory.</li>
<li>radv: fix cik macroModeIndex.</li>
<li>radv: adopt some init config workarounds from radeonsi.</li>
<li>radv: fix depth format in blit2d.</li>
<li>radv: fix txs for sampler buffers</li>
</ul>
<p>Emil Velikov (8):</p>
<ul>
<li>docs: add sha256 checksums for 17.0.0</li>
<li>bin/get-extra-pick-list: use git merge-base to get the branchpoint</li>
<li>bin/get-extra-pick-list: rework to use already_picked list</li>
<li>bin/get-typod-pick-list.sh: limit `git grep ...' to only as needed</li>
<li>bin/get-pick-list.sh: limit `git grep ...' only as needed</li>
<li>bin/get-pick-list.sh: remove ancient way of nominating patches</li>
<li>bin/get-fixes-pick-list.sh: add new script</li>
<li>Update version to 17.0.1</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>vc4: Avoid emitting small immediates for UBO indirect load address guards.</li>
</ul>
<p>Grazvydas Ignotas (3):</p>
<ul>
<li>r300g: only allow byteswapped formats on big endian</li>
<li>gallium/u_queue: fix a crash with atexit handlers</li>
<li>gallium/u_queue: set num_threads correctly if not all threads start</li>
</ul>
<p>Hans de Goede (1):</p>
<ul>
<li>glx/glvnd: Fix GLXdispatchIndex sorting</li>
</ul>
<p>Ilia Mirkin (4):</p>
<ul>
<li>gm107/ir: fix address offset bitfield for ATOMS</li>
<li>nvc0: set the render condition in the compute object</li>
<li>st/mesa: don't pass compare mode for stencil-sampled textures</li>
<li>nvc0: disable linked tsc mode in compute launch descriptor</li>
</ul>
<p>Jason Ekstrand (10):</p>
<ul>
<li>i965/sampler_state: Clamp min/max LOD to 14 on gen7+</li>
<li>i965/sampler_state: Pass texObj into update_sampler_state</li>
<li>i965/sampler_state: Set the "Base Mip Level" field on Sandy Bridge</li>
<li>intel/blorp: Swizzle clear colors on the CPU</li>
<li>i965/fs: Fix the inline nir_op_pack_double optimization</li>
<li>anv: Add an invalidate_range helper</li>
<li>anv/query: clflush the bo map on non-LLC platforms</li>
<li>genxml: Make MI_STORE_DATA_IMM more consistent</li>
<li>anv/query: Perform CmdResetQueryPool on the GPU</li>
<li>intel/blorp: Explicitly flush all allocated state</li>
</ul>
<p>Jose Maria Casanova Crespo (1):</p>
<ul>
<li>glsl: non-last member unsized array on SSBO must fail compilation on GLSL ES 3.1</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>mesa: Do (TCS &amp;&amp; !TES) draw time validation in ES as well.</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>configure.ac: check require_basic_egl only if egl enabled</li>
</ul>
<p>Lionel Landwerlin (2):</p>
<ul>
<li>anv: wsi: report presentation error per image request</li>
<li>i965/fs: fix uninitialized memory access</li>
</ul>
<p>Marek Olšák (6):</p>
<ul>
<li>radeonsi: fix UNSIGNED_BYTE index buffer fallback with non-zero start (v2)</li>
<li>gallium/util: remove unused u_index_modify helpers</li>
<li>gallium/u_index_modify: don't add PIPE_TRANSFER_UNSYNCHRONIZED unconditionally</li>
<li>gallium/u_queue: fix random crashes when the app calls exit()</li>
<li>radeonsi: fix broken tessellation on Carrizo and Stoney</li>
<li>amd/common: fix ASICREV_IS_POLARIS11_M for Polaris12</li>
</ul>
<p>Mauro Rossi (2):</p>
<ul>
<li>android: radeonsi: fix sid_table.h generated header include path</li>
<li>android: glsl: build shader cache sources</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>configure.ac: Drop LLVM compiler flags more radically</li>
</ul>
<p>Nicolai Hähnle (3):</p>
<ul>
<li>winsys/amdgpu: reduce max_alloc_size based on GTT limits</li>
<li>radeonsi: handle MultiDrawIndirect in si_get_draw_start_count</li>
<li>radeonsi: fix UINT/SINT clamping for 10-bit formats on &lt;= CIK</li>
</ul>
<p>Samuel Iglesias Gonsálvez (1):</p>
<ul>
<li>glsl: fix heap-use-after-free in ast_declarator_list::hir()</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>android: fix droid_create_image_from_prime_fd_yuv for YV12</li>
</ul>
</div>
</body>
</html>

185
docs/relnotes/17.0.2.html Normal file

@@ -0,0 +1,185 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.0.2 Release Notes / March 20, 2017</h1>
<p>
Mesa 17.0.2 is a bug fix release which fixes bugs found since the 17.0.1 release.
</p>
<p>
Mesa 17.0.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
2e0f41e7974ba7a36ca32bbeaf8ebcd65c8fd4d2dc9872f04d4becbd5e7a8cb5 mesa-17.0.2.tar.gz
f8f191f909e01e65de38d5bdea5fb057f21649a3aed20948be02348e77a689d4 mesa-17.0.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68504">Bug 68504</a> - 9.2-rc1 workaround for clover build failure on ppc/altivec: cannot convert 'bool' to '__vector(4) __bool int' in return</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97988">Bug 97988</a> - [radeonsi] playing back videos with VDPAU exhibits deinterlacing/anti-aliasing issues not visible with VA-API</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99484">Bug 99484</a> - Crusader Kings 2 - Loading bars, siege bars, morale bars, etc. do not render correctly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99715">Bug 99715</a> - Don't print: &quot;Note: Buggy applications may crash, if they do please report to vendor&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100049">Bug 100049</a> - &quot;ralloc: Make sure ralloc() allocations match malloc()'s alignment.&quot; causes seg fault in 32bit build</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (3):</p>
<ul>
<li>radv: Emit pending flushes before executing a secondary command buffer</li>
<li>radv: Flush before copying with PKT3_WRITE_DATA in CmdUpdateBuffer</li>
<li>radv/ac: Fix shared memory offset calculation</li>
</ul>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>radv: Disable HTILE for textures with multiple layers/levels.</li>
<li>radv: Emit cache flushes before CP DMA.</li>
<li>Revert "radv: Emit cache flushes before CP DMA."</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>radv: drop Z24 support.</li>
<li>radv: disable mip point pre clamping.</li>
<li>radv: setup llvm target data layout</li>
</ul>
<p>Emil Velikov (4):</p>
<ul>
<li>docs: add sha256 checksums for 17.0.1</li>
<li>cherry-ignore: add the swizzle blorp_clear fix</li>
<li>i965: move brw_define.h ifndef guard to the top</li>
<li>Update version to 17.0.2</li>
</ul>
<p>Fredrik Höglund (2):</p>
<ul>
<li>radv: fix the dynamic buffer index in vkCmdBindDescriptorSets</li>
<li>radv/ac: fix multiple descriptor sets with dynamic buffers</li>
</ul>
<p>Gregory Hainaut (1):</p>
<ul>
<li>glapi: fix typo in count_scale</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>nvc0: take extra pushbuf space into account for pushbuf_space calls</li>
<li>nvc0: increase alignment to 256 for texture buffers on fermi</li>
</ul>
<p>Jacob Lifshay (1):</p>
<ul>
<li>vulkan/wsi: Improve the DRI3 error message</li>
</ul>
<p>James Legg (1):</p>
<ul>
<li>radv: Fix using more than 4 bound descriptor sets</li>
</ul>
<p>Jason Ekstrand (7):</p>
<ul>
<li>anv/blorp/clear_subpass: Only set surface clear color for fast clears</li>
<li>anv: Accurately advertise dynamic descriptor limits</li>
<li>anv: Stall before fast-clear operations</li>
<li>anv: Properly handle destroying NULL devices and instances</li>
<li>anv/blorp: Turn off AUX after doing a CCS_D resolve</li>
<li>anv/blorp: Only set a clear color for resolves if fast-cleared</li>
<li>nir/intrinsics: Make load_barycentric_input take a 2-component coor</li>
</ul>
<p>Jonas Pfeil (1):</p>
<ul>
<li>ralloc: Make sure ralloc() allocations match malloc()'s alignment.</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>egl: Ensure ResetNotificationStrategy matches for shared contexts.</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>st/mesa: reset sample_mask, min_sample, and render_condition for PBO ops</li>
<li>st/mesa: set blend state for PBO readbacks</li>
<li>radeonsi: mark all bound shader buffer ranges as initialized</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>clover: Work around build failure with AltiVec.</li>
</ul>
<p>Nanley Chery (2):</p>
<ul>
<li>anv/pass: Avoid accessing attachment array out of bounds</li>
<li>anv/image: Remove extra dependency on HiZ-specific variable</li>
</ul>
<p>Nicolai Hähnle (2):</p>
<ul>
<li>st/glsl_to_tgsi: avoid iterating past the head of the instruction list</li>
<li>st/mesa: inform the driver of framebuffer changes before compute dispatches</li>
</ul>
<p>Robert Foss (1):</p>
<ul>
<li>mesa: Avoid read of uninitialized variable</li>
</ul>
<p>Samuel Iglesias Gonsálvez (5):</p>
<ul>
<li>i965/fs: mark last DF uniform array element as 64 bit live one</li>
<li>i965/fs: detect different bit size accesses to uniforms to push them in proper locations</li>
<li>i965/fs: fix indirect load DF uniforms on BSW/BXT</li>
<li>i965/fs: fix source type when emitting MOV_INDIRECT to read ICP handles</li>
<li>i965/fs: emit MOV_INDIRECT with the source with the right register type</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radeonsi: disable sinking common instructions down to the end block</li>
</ul>
</div>
</body>
</html>

189
docs/relnotes/17.0.3.html Normal file

@@ -0,0 +1,189 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.0.3 Release Notes / April 1, 2017</h1>
<p>
Mesa 17.0.3 is a bug fix release which fixes bugs found since the 17.0.2 release.
</p>
<p>
Mesa 17.0.3 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
8253edf1bdd7b14ab63d5982349143a5c9ac3767f39a63257cc9d7e7d92f60f1 mesa-17.0.3.tar.gz
ca646f5075a002d60ef9123c8a4331cede155c01712ef945a65c59a5e69fe7ed mesa-17.0.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96743">Bug 96743</a> - [BYT, HSW, SKL, BXT, KBL] GPU hangs with GfxBench 4.0 CarChase</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99246">Bug 99246</a> - [d3dadapter+radeonsi &amp; bisect] EVE-Online : hang on wormhole sight</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100061">Bug 100061</a> - LODQ instruction generated with invalid dst mask</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100182">Bug 100182</a> - Flickering in The Talos Principle on Sky Lake GT4.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100201">Bug 100201</a> - Windows scons build with MSVC toolchain and LLVM 4.0 fails</li>
</ul>
<h2>Changes</h2>
<p>Alex Deucher (1):</p>
<ul>
<li>radeonsi: add new polaris12 pci id</li>
</ul>
<p>Andres Gomez (5):</p>
<ul>
<li>glsl: on UBO/SSBOs link error reset the number of active blocks to 0</li>
<li>cherry-ignore: add the Invalidate L2 for TRANSFER_WRITE barriers fix</li>
<li>cherry-ignore: add the Flush after unmap in gbm/dri fix</li>
<li>cherry-ignore: corrected typo in the Flush after unmap in gbm/dri fix</li>
<li>Update version to 17.0.3</li>
</ul>
<p>Axel Davy (2):</p>
<ul>
<li>st/nine: Resolve deadlock in surface/volume dtors when using csmt</li>
<li>st/nine: Use atomics for available_texture_mem</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>radv: flush DB cache before and after HTILE decompress.</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>radv: fix primitive reset index emission</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.0.2</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>st/mesa: set result writemask based on ir type</li>
</ul>
<p>Jan Vesely (1):</p>
<ul>
<li>clover: use pipe_resource references</li>
</ul>
<p>Jason Ekstrand (9):</p>
<ul>
<li>anv/query: Invalidate the correct range</li>
<li>anv/GetQueryPoolResults: Actually implement the spec</li>
<li>anv/image: Return early when unbinding an image</li>
<li>anv/query: Fix the location of timestamp availability</li>
<li>anv: Make anv_get_layerCount a macro</li>
<li>anv/blorp: Use anv_get_layerCount everywhere</li>
<li>anv/cmd_buffer: Apply flush operations prior to executing secondaries</li>
<li>anv/cmd_buffer: Fix bad indentation</li>
<li>anv: Flush caches prior to PIPELINE_SELECT on all gens</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>c11/threads: Include thr/xtimec.h for xtime definition when building with MSVC.</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>tests/cache_test: allow crossing mount points</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nvc0/ir: treat FMA like MAD for operand propagation</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Fall back to GL 4.2/4.3 on Haswell if the kernel isn't new enough.</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>radeonsi: don't hang on shader compile failure</li>
</ul>
<p>Matt Turner (1):</p>
<ul>
<li>i965/fs: Don't emit SEL instructions for type-converting MOVs.</li>
</ul>
<p>Nanley Chery (1):</p>
<ul>
<li>intel: Correct the BDW surface state size</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>mesa/main: fix MultiDrawElements[BaseVertex] validation of primcount</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>freedreno: fix memory leak</li>
</ul>
<p>Tim Rowley (1):</p>
<ul>
<li>swr: [rasterizer jitter] fix llvm &gt;= 5.0 build break</li>
</ul>
<p>Timothy Arceri (2):</p>
<ul>
<li>glsl: fix lower jumps for returns when loop is inside an if</li>
<li>mesa: update lower_jumps tests after bug fix</li>
</ul>
<p>Topi Pohjolainen (1):</p>
<ul>
<li>i965/gen8+: Do full stall when switching pipeline</li>
</ul>
<p>Xu Randy (2):</p>
<ul>
<li>anv/blorp: Fix a crash in CmdClearColorImage</li>
<li>anv/genX: Solve the vkCreateGraphicsPipelines crash</li>
</ul>
</div>
</body>
</html>

156
docs/relnotes/17.0.4.html Normal file

@@ -0,0 +1,156 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.0.4 Release Notes / April 17, 2017</h1>
<p>
Mesa 17.0.4 is a bug fix release which fixes bugs found since the 17.0.3 release.
</p>
<p>
Mesa 17.0.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
c4c34ba05d48f76b45bc05bc4b6e9242077f403d63c4f0c355c7b07786de233e mesa-17.0.4.tar.gz
1269dc8545a193932a0779b2db5bce9be4a5f6813b98c38b93b372be8362a346 mesa-17.0.4.tar.xz
</pre>
<h2>Next release</h2>
<p>
Mesa 17.0.5 is expected in approximately two weeks. See the release
<a href="../release-calendar.html#calendar" target="_parent">calendar</a>
for details.
</p>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99515">Bug 99515</a> - SIGSEGV MAPERR on Android nougat-x86 with mesa 17.0.0rc</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100391">Bug 100391</a> - SachaWillems deferredmultisampling asserts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100452">Bug 100452</a> - push_constants host memory leak when resetting command buffer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100582">Bug 100582</a> - [GEN8+] piglit.spec.arb_stencil_texturing.glblitframebuffer corrupts state.gl_texture* assertions</li>
</ul>
<h2>Changes</h2>
<p>Alex Deucher (1):</p>
<ul>
<li>radeonsi: add new polaris10 pci id</li>
</ul>
<p>Alex Smith (1):</p>
<ul>
<li>radv: Invalidate L2 for TRANSFER_WRITE barriers</li>
</ul>
<p>Andres Gomez (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.0.3</li>
</ul>
<p>Craig Stout (1):</p>
<ul>
<li>anv/cmd_buffer: fix host memory leak</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>Revert "cherry-ignore: add the Flush after unmap in gbm/dri fix"</li>
<li>Revert "freedreno: fix memory leak"</li>
<li>Update version to 17.0.4</li>
</ul>
<p>Fabio Estevam (1):</p>
<ul>
<li>loader: Move non-error message to debug level</li>
</ul>
<p>Ilia Mirkin (4):</p>
<ul>
<li>nvc0/ir: fix LSB/BFE/BFI implementations</li>
<li>nvc0/ir: fix overwriting of offset register with interpolateAtOffset</li>
<li>nvc0: increase texture buffer object alignment to 256 for pre-GM107</li>
<li>nouveau: when mapping a persistent buffer, synchronize on former xfers</li>
</ul>
<p>Jason Ekstrand (5):</p>
<ul>
<li>i965/fs: Always provide a default LOD of 0 for TXS and TXL</li>
<li>anv/pipeline: Properly handle unset gl_Layer and gl_ViewportIndex</li>
<li>anv/blorp: Align vertex buffers to 64B</li>
<li>i965/blorp: Align vertex buffers to 64B</li>
<li>i965/blorp: Bump the batch space estimate</li>
</ul>
<p>Jerome Duval (2):</p>
<ul>
<li>haiku: build fixes around debug defines</li>
<li>haiku/winsys: fix dt prototype args</li>
</ul>
<p>Julien Isorce (4):</p>
<ul>
<li>winsys/radeon: check null in radeon_cs_create_fence</li>
<li>winsys/radeon: check null return from radeon_cs_create_fence in cs_flush</li>
<li>radeon: initialize hole variable before calling container_of</li>
<li>radeon_drm_bo: explicitly check return value of drmCommandWriteRead</li>
</ul>
<p>Kenneth Graunke (4):</p>
<ul>
<li>i965: Document the sad story of the kernel command parser.</li>
<li>i965: Set screen-&gt;cmd_parser_version to 0 if we can't write registers.</li>
<li>i965: Skip register write detection when possible.</li>
<li>i965: Set kernel features before computing max GL version.</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>targets: export radeon winsys_create functions to silence LLVM warning</li>
</ul>
<p>Michal Srb (1):</p>
<ul>
<li>st: Add cubeMapFace parameter to st_finalize_texture.</li>
</ul>
<p>Thomas Hellstrom (1):</p>
<ul>
<li>gbm/dri: Flush after unmap</li>
</ul>
</div>
</body>
</html>

144
docs/relnotes/17.0.5.html Normal file

@@ -0,0 +1,144 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.0.5 Release Notes / April 28, 2017</h1>
<p>
Mesa 17.0.5 is a bug fix release which fixes bugs found since the 17.0.4 release.
</p>
<p>
Mesa 17.0.5 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
7510eee0d0077860b250d30d73305048c2df4ba09ea8fc04e4f3eec7beece301 mesa-17.0.5.tar.gz
668efa445d2f57a26e5c096b1965a685733a3b57d9c736f9d6460263847f9bfe mesa-17.0.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97524">Bug 97524</a> - Samplers referring to the same texture unit with different types should raise GL_INVALID_OPERATION</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (16):</p>
<ul>
<li>cherry-ignore: Add the pci_id into the shader cache UUID</li>
<li>cherry-ignore: fix crash if ctx torn down with no rendering</li>
<li>cherry-ignore: Fix typos.</li>
<li>cherry-ignore: Revert "etnaviv: Cannot render to rb-swapped formats"</li>
<li>cherry-ignore: Revert "i965/fs: Don't emit SEL instructions for type-converting MOVs."</li>
<li>cherry-ignore: fix typo in a2b10g10r10 fast clear calculation</li>
<li>cherry-ignore: remove unused anv_dispatch_table dtable</li>
<li>cherry-ignore: remove unused radv_dispatch_table dtable</li>
<li>cherry-ignore: make radv_resolve_entrypoint static</li>
<li>cherry-ignore: vulkan: add support for libmesa_vulkan_util</li>
<li>cherry-ignore: r600: fix libmesa_amd_common dependency</li>
<li>cherry-ignore: remove dead brw_new_shader() declaration</li>
<li>cherry-ignore: remove i965_symbols_test reference from .gitignore</li>
<li>cherry-ignore: automake: ensure that the destination directory is created</li>
<li>cherry-ignore: provide required gem stubs for the tests</li>
<li>Update version to 17.0.5</li>
</ul>
<p>Boyan Ding (2):</p>
<ul>
<li>nvc0/ir: Properly handle a "split form" of predicate destination</li>
<li>nir: Destination component count of shader_clock intrinsic is 2</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>docs: add sha256 checksums for 17.0.4</li>
<li>winsys/sw/dri: don't use GNU void pointer arithmetic</li>
<li>st/clover: add space between &lt; and ::</li>
<li>configure.ac: check require_basic_egl only if egl enabled</li>
<li>st/mesa: automake: honour the vdpau header install location</li>
</ul>
<p>Francisco Jerez (2):</p>
<ul>
<li>intel/fs: Use regs_written() in spilling cost heuristic for improved accuracy.</li>
<li>intel/fs: Take into account amount of data read in spilling cost heuristic.</li>
</ul>
<p>Grazvydas Ignotas (1):</p>
<ul>
<li>radv: report timestampPeriod correctly</li>
</ul>
<p>Jason Ekstrand (5):</p>
<ul>
<li>anv/blorp: Flush the texture cache in UpdateBuffer</li>
<li>anv/cmd_buffer: Flush the VF cache at the top of all primaries</li>
<li>anv/cmd_buffer: Always set up a null surface state</li>
<li>anv/cmd_buffer: Use the null surface state for ATTACHMENT_UNUSED</li>
<li>anv/blorp: Properly handle VK_ATTACHMENT_UNUSED</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965/vec4: Avoid reswizzling MACH instructions in opt_register_coalesce().</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>st/mesa: invalidate the readpix cache in st_indirect_draw_vbo</li>
</ul>
<p>Nanley Chery (1):</p>
<ul>
<li>anv/cmd_buffer: Disable CCS on BDW input attachments</li>
</ul>
<p>Nicolai Hähnle (4):</p>
<ul>
<li>mesa: fix remaining xfb prims check for GLES with multiple instances</li>
<li>mesa: extract need_xfb_remaining_prims_check</li>
<li>mesa: move glMultiDrawArrays to vbo and fix error handling</li>
<li>vbo: fix gl_DrawID handling in glMultiDrawArrays</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>util/queue: don't hang at exit</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>mesa: validate sampler type across the whole program</li>
</ul>
</div>
</body>
</html>

186
docs/relnotes/17.0.6.html Normal file

@@ -0,0 +1,186 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.0.6 Release Notes / May 12, 2017</h1>
<p>
Mesa 17.0.6 is a bug fix release which fixes bugs found since the 17.0.5 release.
</p>
<p>
Mesa 17.0.6 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
f1b2497d553e9a584f0caa3a2d9d310e27ead15fb0af170da69f6e70fb5031cd mesa-17.0.6.tar.gz
89ecf3bcd0f18dcca5aaa42bf36bb52a2df33be89889f94aaaad91f7a504a69d mesa-17.0.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98428">Bug 98428</a> - Undefined non-weak-symbol in dri-drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100854">Bug 100854</a> - YUV to RGB Color Space Conversion result is not precise</li>
</ul>
<h2>Changes</h2>
<p>Adam Jackson (1):</p>
<ul>
<li>egl/platform/drm: Don't take display ownership until gbm is initialized</li>
</ul>
<p>Andres Gomez (7):</p>
<ul>
<li>docs: add sha256 checksums for 17.0.5</li>
<li>travis: replace Trusty-based LLVM toolchain apt-get with apt addon</li>
<li>travis: add the possibility of using the txc-dxtn library</li>
<li>cherry-ignore: 17.1 nominations only</li>
<li>cherry-ignore: fix regression in descriptor set freeing.</li>
<li>cherry-ignore: rejected commits</li>
<li>Update version to 17.0.6</li>
</ul>
<p>Ben Boeckel (1):</p>
<ul>
<li>scons: update for LLVM 4.0</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>st/mesa: move duplicated st_ws_framebuffer() function into header file</li>
</ul>
<p>Chad Versace (3):</p>
<ul>
<li>egl: Emit error when EGLSurface is lost</li>
<li>egl/android: Cancel any outstanding ANativeBuffer in surface destructor</li>
<li>egl/android: Mark surface as lost when dequeueBuffer fails</li>
</ul>
<p>Christian Gmeiner (1):</p>
<ul>
<li>etnaviv: add L8A8_UNORM texture format</li>
</ul>
<p>Dave Airlie (2):</p>
<ul>
<li>radv/wsi: report presentation error per image request</li>
<li>radv: enable POLARIS12 support.</li>
</ul>
<p>Emil Velikov (21):</p>
<ul>
<li>travis: correct libdrm required regex to also track libdrm itself</li>
<li>travis: add nearly all gallium drivers to the list</li>
<li>travis: use both cores for make/make check</li>
<li>travis: bring the scons build on par with AppVeyor</li>
<li>travis: explicitly LD_LIBRARY_PATH the local libraries</li>
<li>travis: enable apt cache</li>
<li>travis: automatically manage ccache caching</li>
<li>travis: remove unused -dev packages</li>
<li>travis: rework "if test" blocks in the script section</li>
<li>travis: split out matrix from env</li>
<li>travis: add separate "scons" and "scons llvm" targets</li>
<li>travis: add "scons swr" to the build matrix</li>
<li>travis: add "make swr" to the build matrix</li>
<li>travis: split the make target to three separate ones</li>
<li>travis: model scons check target like the make one</li>
<li>travis: add Gallium state-tracker targets</li>
<li>travis: enable wayland support</li>
<li>travis: bump MAKEFLAGS to -j4</li>
<li>gallium/dri: always link against shared glapi</li>
<li>mesa/dri: always link against shared glapi</li>
<li>glx: glX_proto_send.py: use correct compile guard GLX_INDIRECT_RENDERING</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>nir: Pick just the channels we want for bitmap and drawpixels lowering.</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>gallium/targets: fix bool setting on BE architectures</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>anv/cmd_buffer: Use the device allocator for QueueSubmit</li>
</ul>
<p>Johnson Lin (1):</p>
<ul>
<li>nir/lower_tex: Fix minor error in YUV color conversion matrix</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>radeonsi: adjust ESGS ring buffer size computation on VI</li>
<li>radeonsi: apply the tess+GS hang workaround to Polaris12 as well</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>radeonsi: fix gl_PrimitiveID in tessellation with instanced draws on SI</li>
</ul>
<p>Philipp Zabel (3):</p>
<ul>
<li>renderonly: close transfer prime_fd</li>
<li>renderonly: drop resources on destroy</li>
<li>renderonly: use drmIoctl</li>
</ul>
<p>Rhys Kidd (3):</p>
<ul>
<li>travis: Support LLVM 3.8+ on Trusty-based Travis-CI via apt-get not apt addon</li>
<li>travis: Add radv vulkan driver to continuous integration</li>
<li>travis: Add radeonsi to continuous integration</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>freedreno/a3xx: fix hang w/ large render targets and small gmem</li>
</ul>
<p>Samuel Iglesias Gonsálvez (5):</p>
<ul>
<li>i965/vec4: fix vertical stride to avoid breaking region parameter rule</li>
<li>i965/vec4: fix register width for DF VGRF and UNIFORM</li>
<li>i965/vec4: don't modify regioning parameters to the sources of DF align1 instructions</li>
<li>anv: anv_gem_mmap() returns MAP_FAILED as mapping error</li>
<li>anv: vkBindImageMemory() should return VK_ERROR_OUT_OF_{HOST,DEVICE}_MEMORY on failure</li>
</ul>
</div>
</body>
</html>

145
docs/relnotes/17.0.7.html Normal file

@@ -0,0 +1,145 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.0.7 Release Notes / June 1, 2017</h1>
<p>
Mesa 17.0.7 is a bug fix release which fixes bugs found since the 17.0.6 release.
</p>
<p>
Mesa 17.0.7 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
bc68d13c6b1a053b855ac453ebf7e62bd89511adf44bad6c613e09f7fa13390a mesa-17.0.7.tar.gz
f6d75304a229c8d10443e219d6b6c0c342567dbab5a879ebe7cfa3c9139c4492 mesa-17.0.7.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98833">Bug 98833</a> - [REGRESSION, bisected] Wayland revert commit breaks non-Vsync fullscreen frame updates</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100741">Bug 100741</a> - Chromium - Memory leak</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100925">Bug 100925</a> - [HSW/BSW/BDW/SKL] Google Earth is not resolving all the details in the map correctly</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.0.6</li>
</ul>
<p>Bartosz Tomczyk (1):</p>
<ul>
<li>mesa: Avoid leaking surface in st_renderbuffer_delete</li>
</ul>
<p>Chad Versace (1):</p>
<ul>
<li>egl: Partially revert 23c86c74, fix eglMakeCurrent</li>
</ul>
<p>Daniel Stone (7):</p>
<ul>
<li>vulkan: Fix Wayland uninitialised registry</li>
<li>vulkan/wsi/wayland: Remove roundtrip when creating image</li>
<li>vulkan/wsi/wayland: Use per-display event queue</li>
<li>vulkan/wsi/wayland: Use proxy wrappers for swapchain</li>
<li>egl/wayland: Don't open-code roundtrip</li>
<li>egl/wayland: Use per-surface event queues</li>
<li>egl/wayland: Ensure we get a back buffer</li>
</ul>
<p>Emil Velikov (5):</p>
<ul>
<li>st/va: fix misplaced closing bracket</li>
<li>anv: automake: list shared libraries after the static ones</li>
<li>radv: automake: list shared libraries after the static ones</li>
<li>egl/wayland: select the format based on the interface used</li>
<li>Update version to 17.0.7</li>
</ul>
<p>Eric Anholt (2):</p>
<ul>
<li>renderonly: Initialize fields of struct winsys_handle.</li>
<li>vc4: Don't allocate new BOs to avoid synchronization when they're shared.</li>
</ul>
<p>Hans de Goede (1):</p>
<ul>
<li>glxglvnddispatch: Add missing dispatch for GetDriverConfig</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>nvc0/ir: SHLADD's middle source must be an immediate</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>i965/blorp: Do and end-of-pipe sync on both sides of fast-clear ops</li>
<li>i965: Round copy size to the nearest block in intel_miptree_copy</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>etnaviv: stop oversizing buffer resources</li>
</ul>
<p>Nanley Chery (2):</p>
<ul>
<li>anv/formats: Update the three-channel BC1 mappings</li>
<li>i965/formats: Update the three-channel DXT1 mappings</li>
</ul>
<p>Pohjolainen, Topi (1):</p>
<ul>
<li>intel/isl/gen7: Use stencil vertical alignment of 8 instead of 4</li>
</ul>
<p>Samuel Iglesias Gonsálvez (3):</p>
<ul>
<li>i965/vec4/gs: restore the uniform values which was overwritten by failed vec4_gs_visitor execution</li>
<li>i965/vec4: fix swizzle and writemask when loading an uniform with constant offset</li>
<li>i965/vec4: load dvec3/4 uniforms first in the push constant buffer</li>
</ul>
<p>Tom Stellard (1):</p>
<ul>
<li>gallivm: Make sure module has the correct data layout when pass manager runs</li>
</ul>
</div>
</body>
</html>

224
docs/relnotes/17.1.0.html Normal file

@@ -0,0 +1,224 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.1.0 Release Notes / May 10, 2017</h1>
<p>
Mesa 17.1.0 is a new development release.
People who are concerned with stability and reliability should stick
with a previous release or wait for
<a href="../release-calendar.html#calendar" target="_parent">Mesa 17.1.1</a>.
</p>
<p>
Mesa 17.1.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
c388069581a72853161657ac365f2c083afabd7cffd53f80513dacfa1cfa58a8 mesa-17.1.0.tar.gz
cf234a6ed4764673886b6661553b54675776ef0898f774716173cec890ac3b17 mesa-17.1.0.tar.xz
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>OpenGL 4.2 on i965/ivb</li>
<li>GL_ARB_gpu_shader_fp64 on i965/ivybridge</li>
<li>GL_ARB_gpu_shader_int64 on i965/gen8+, nvc0, radeonsi, softpipe, llvmpipe</li>
<li>GL_ARB_shader_ballot on nvc0, radeonsi</li>
<li>GL_ARB_shader_clock on nv50, nvc0, radeonsi</li>
<li>GL_ARB_shader_group_vote on radeonsi</li>
<li>GL_ARB_shader_precision on i965/ivb</li>
<li>GL_ARB_shader_viewport_layer_array on radeonsi</li>
<li>GL_ARB_sparse_buffer on radeonsi/CIK+</li>
<li>GL_ARB_transform_feedback2 on i965/gen6</li>
<li>GL_ARB_transform_feedback_overflow_query on i965/gen6+</li>
<li>GL_ARB_vertex_attrib_64bit on i965/ivb</li>
<li>GL_NV_fill_rectangle on nvc0</li>
<li>Geometry shaders enabled on swr</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68504">Bug 68504</a> - 9.2-rc1 workaround for clover build failure on ppc/altivec: cannot convert 'bool' to '__vector(4) __bool int' in return</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=84325">Bug 84325</a> - X.Org segfaults when starting DE on an Intel+Radeon laptop, caused by libpciaccess cleanup, patch attached</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=93089">Bug 93089</a> - mesa fails to check for gcc atomic primitives before using them</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95460">Bug 95460</a> - Please add more drivers (freedreno, virgl) to features.txt status document</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96743">Bug 96743</a> - [BYT, HSW, SKL, BXT, KBL] GPU hangs with GfxBench 4.0 CarChase</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97102">Bug 97102</a> - [dri][swr] stack overflow / infinite loop with GALLIUM_DRIVER=swr</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97338">Bug 97338</a> - Black squares in the Spec Ops: The Line chapter select screen</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97524">Bug 97524</a> - Samplers referring to the same texture unit with different types should raise GL_INVALID_OPERATION</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97967">Bug 97967</a> - glsl/tests/cache-test regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97988">Bug 97988</a> - [radeonsi] playing back videos with VDPAU exhibits deinterlacing/anti-aliasing issues not visible with VA-API</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98263">Bug 98263</a> - [radv] The Talos Principle fails to launch with &quot;Fatal error: Cannot set display mode.&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98428">Bug 98428</a> - Undefined non-weak-symbol in dri-drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98502">Bug 98502</a> - Delay when starting firefox, thunderbird or chromium and dmesg spam</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98869">Bug 98869</a> - Electronic Super Joy graphic artefacts (regression,bisected)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98975">Bug 98975</a> - Wasteland 2 Directors Cut: Hangs. GPU fault</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99010">Bug 99010</a> - --disable-gallium-llvm no longer recognized</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99246">Bug 99246</a> - [d3dadapter+radeonsi &amp; bisect] EVE-Online : hang on wormhole sight</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99265">Bug 99265</a> - i965: Piglit egl_khr_gl_renderbuffer_image-clear-shared-image fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99339">Bug 99339</a> - Blender line rendering broken after removing XY clipping of lines</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99401">Bug 99401</a> - [g33] regression: piglit.spec.!opengl 1_0.gl-1_0-beginend-coverage</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99450">Bug 99450</a> - [amdgpu] Payday 2 visual glitches on some models</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99451">Bug 99451</a> - polygon offset use after free</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99456">Bug 99456</a> - Firefox crashing when opening about:support with WebGL2 enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99465">Bug 99465</a> - vtn_vector_construct writing out of bounds when given multiple non-zero length sources</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99484">Bug 99484</a> - Crusader Kings 2 - Loading bars, siege bars, morale bars, etc. do not render correctly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99532">Bug 99532</a> - Compute shader doesn't give right result under some circumstances</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99542">Bug 99542</a> - vdpau logging errors since gallium/radeon: adjust the rule for using the LINEAR_ALIGNED layout</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99631">Bug 99631</a> - segfault with OSVRTrackerView and openscenegraph git master</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99633">Bug 99633</a> - rasterizer/core/clip.h:279:49: error: const struct API_STATE has no member named linkageCount</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99660">Bug 99660</a> - Not all of the int64 conversion opcodes got implemented</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99677">Bug 99677</a> - heap-use-after-free in glsl</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99692">Bug 99692</a> - [radv] Mostly broken on Hawaii PRO/CIK ASICs</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99701">Bug 99701</a> - loader.c:353:8: error: implicit declaration of function 'geteuid' is invalid in C99 [-Werror,-Wimplicit-function-declaration]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99715">Bug 99715</a> - Don't print: &quot;Note: Buggy applications may crash, if they do please report to vendor&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99789">Bug 99789</a> - Memory leak on failure to create an ir_constant in calculate_iterations in loop_controls.cpp</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99817">Bug 99817</a> - [softpipe] piglit glsl-fs-tan-1 regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99842">Bug 99842</a> - GL_ARB_transform_feedback2 on i965 gen6</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99850">Bug 99850</a> - Tessellation bug on Carrizo</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99918">Bug 99918</a> - disk_cache.h:57:20: error: no member named 'st_mtim' in 'struct stat'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99953">Bug 99953</a> - device9.c:122:49: error: PIPE_CAP_USER_INDEX_BUFFERS undeclared (first use in this function)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99955">Bug 99955</a> - [r600g] GPU load always displayed at 100% with GALLIUM_HUD=GPU-load</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100026">Bug 100026</a> - piglit.spec.arb_shader_subroutine.compiler.direct-call_vert regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100049">Bug 100049</a> - &quot;ralloc: Make sure ralloc() allocations match malloc()'s alignment.&quot; causes seg fault in 32bit build</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100060">Bug 100060</a> - wsi/wsi_common_wayland.c:25:41: fatal error: wayland-drm-client-protocol.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100061">Bug 100061</a> - LODQ instruction generated with invalid dst mask</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100068">Bug 100068</a> - LLVM ERROR: Cannot select: intrinsic %llvm.amdgcn.buffer.load.format</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100088">Bug 100088</a> - piglit.spec.arb_get_texture_sub_image.arb_get_texture_sub_image regressions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100091">Bug 100091</a> - Failure to create folder for on-disk shader cache</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100133">Bug 100133</a> - swr_context.cpp:336:44: error: invalid conversion from uint {aka unsigned int} to pipe_render_cond_flag [-fpermissive]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100154">Bug 100154</a> - test_eu_compact regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100180">Bug 100180</a> - Build failure in GNOME Continuous</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100182">Bug 100182</a> - Flickering in The Talos Principle on Sky Lake GT4.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100201">Bug 100201</a> - Windows scons build with MSVC toolchain and LLVM 4.0 fails</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100223">Bug 100223</a> - marshal_generated.c:38:10: fatal error: 'X11/Xlib-xcb.h' file not found</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100236">Bug 100236</a> - Undefined symbols for architecture x86_64: &quot;typeinfo for llvm::RTDyldMemoryManager&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100259">Bug 100259</a> - [EGL] [GBM] undefined reference to `gbm_bo_create_with_modifiers'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100288">Bug 100288</a> - clover unable to run OpenCL kernels since 03127bb radeonsi: compile all TGSI compute shaders asynchronously</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100303">Bug 100303</a> - Adding a single, meaningless if-else to a shader source leads to different image</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100391">Bug 100391</a> - SachaWillems deferredmultisampling asserts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100452">Bug 100452</a> - push_constants host memory leak when resetting command buffer</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100531">Bug 100531</a> - [regression] Broken graphics in several games</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100562">Bug 100562</a> - u_debug_stack.c:59: undefined reference to `_Ux86_64_getcontext'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100569">Bug 100569</a> - core/resource.cpp:36:33: error: non-constant-expression cannot be narrowed from type 'int' to 'int16_t' (aka 'short') in initializer list [-Wc++11-narrowing]</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100574">Bug 100574</a> - anv_device.c:189: undefined reference to `anv_gem_supports_48b_addresses'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100582">Bug 100582</a> - [GEN8+] piglit.spec.arb_stencil_texturing.glblitframebuffer corrupts state.gl_texture* assertions</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100600">Bug 100600</a> - anv_device.c:1337: undefined reference to `anv_gem_busy'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100620">Bug 100620</a> - [SKL] 48-bit addresses break DOOM</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100663">Bug 100663</a> - commit 61e47d92c5196 breaks RS780</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100690">Bug 100690</a> - [Regression, bisected] TotalWar: Warhammer corrupted graphics</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100892">Bug 100892</a> - Polaris 12: winsys init bad switch (missing break) initializing addrlib</li>
</ul>
<h2>Changes</h2>
<ul>
<li>Removed the ilo gallium driver.</li>
<li>The configure option --enable-gallium-llvm is superseded by --enable-llvm.</li>
<li>The swr driver now requires LLVM &gt;= 3.9.0 and a C++14 capable compiler.</li>
<li>The radeonsi driver now requires LLVM 3.8.0.</li>
<li>The MESA_GLSL=opt and MESA_GLSL=no_opt environment variables have been removed.</li>
<li>The --with-egl-platforms configure option is deprecated. Use --with-platforms instead.</li>
</ul>
</div>
</body>
</html>

188
docs/relnotes/17.1.1.html Normal file

@@ -0,0 +1,188 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.1.1 Release Notes / May 25, 2017</h1>
<p>
Mesa 17.1.1 is a bug fix release which fixes bugs found since the 17.1.0 release.
</p>
<p>
Mesa 17.1.1 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
652315af87f2bb015ce99ee3b90d9d115d53cbf9e052493bd13d521a753b1930 mesa-17.1.1.tar.gz
aed503f94c0c1630a162a3e276f4ee12a86764cee4cb92338ea2dea99a04e7ef mesa-17.1.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100854">Bug 100854</a> - YUV to RGB Color Space Conversion result is not precise</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100925">Bug 100925</a> - [HSW/BSW/BDW/SKL] Google Earth is not resolving all the details in the map correctly</li>
</ul>
<h2>Changes</h2>
<p>Alex Deucher (1):</p>
<ul>
<li>radeonsi: add new vega10 pci ids</li>
</ul>
<p>Andres Gomez (2):</p>
<ul>
<li>bin/get-fixes-pick-list.sh: don't warn if more than one, go over them</li>
<li>bin/get-fixes-pick-list.sh: bring back the warning</li>
</ul>
<p>Bruce Cherniak (1):</p>
<ul>
<li>swr: move msaa resolve to generalized StoreTile</li>
</ul>
<p>Chad Versace (1):</p>
<ul>
<li>egl: Partially revert 23c86c74, fix eglMakeCurrent</li>
</ul>
<p>Chih-Wei Huang (1):</p>
<ul>
<li>Android: correct libz dependency</li>
</ul>
<p>Daniel Stone (1):</p>
<ul>
<li>gbm/dri: Fix sign-extension in modifier query</li>
</ul>
<p>Emil Velikov (6):</p>
<ul>
<li>docs: add sha256 checksums for 17.1.0</li>
<li>radeon: automake: remove unneeded elf Cflags/Libs</li>
<li>configure: remove unneeded bits around libunwind handling</li>
<li>egl: add g_egldispatchstubs.h to the release tarball</li>
<li>automake: add SWR LLVM gen_builder.hpp workaround</li>
<li>Update version to 17.1.1</li>
</ul>
<p>Eric Anholt (2):</p>
<ul>
<li>renderonly: Initialize fields of struct winsys_handle.</li>
<li>vc4: Don't allocate new BOs to avoid synchronization when they're shared.</li>
</ul>
<p>Grazvydas Ignotas (2):</p>
<ul>
<li>anv: fix possible stack corruption</li>
<li>anv: don't leak DRM devices</li>
</ul>
<p>Hans de Goede (1):</p>
<ul>
<li>glxglvnddispatch: Add missing dispatch for GetDriverConfig</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>nvc0/ir: SHLADD's middle source must be an immediate</li>
</ul>
<p>Johnson Lin (1):</p>
<ul>
<li>nir/lower_tex: Fix minor error in YUV color conversion matrix</li>
</ul>
<p>Juan A. Suarez Romero (2):</p>
<ul>
<li>bin/get-{extra,fixes}-pick-list.sh: add support for ignore list</li>
<li>bin/get-{extra,fixes}-pick-list.sh: improve output</li>
</ul>
<p>Lucas Stach (2):</p>
<ul>
<li>etnaviv: stop oversizing buffer resources</li>
<li>etnaviv: allow R/B swapped surfaces to be cleared</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>amd/addrlib: import Raven support</li>
<li>radeonsi/gfx9: add support for Raven</li>
</ul>
<p>Nanley Chery (2):</p>
<ul>
<li>anv/formats: Update the three-channel BC1 mappings</li>
<li>i965/formats: Update the three-channel DXT1 mappings</li>
</ul>
<p>Nicolai Hähnle (5):</p>
<ul>
<li>radeonsi: mark fast-cleared textures as compressed when dirtying</li>
<li>radeonsi: fix primitive ID in fragment shader when using tessellation</li>
<li>radeonsi: fix gl_PrimitiveID in tessellation with instanced draws on SI</li>
<li>radeonsi: fix gl_PrimitiveIDIn in geometry shader when using tessellation</li>
<li>st/mesa: remove an incorrect assertion</li>
</ul>
<p>Pohjolainen, Topi (1):</p>
<ul>
<li>intel/isl/gen7: Use stencil vertical alignment of 8 instead of 4</li>
</ul>
<p>Rob Clark (2):</p>
<ul>
<li>mesa/st: fix yuv EGLImage's</li>
<li>freedreno: fix crash when flush() but no rendering</li>
</ul>
<p>Rob Herring (1):</p>
<ul>
<li>virgl: fix virgl_bo_transfer_{put, get} box struct copy</li>
</ul>
<p>Samuel Iglesias Gonsálvez (3):</p>
<ul>
<li>i965/vec4/gs: restore the uniform values which was overwritten by failed vec4_gs_visitor execution</li>
<li>i965/vec4: fix swizzle and writemask when loading an uniform with constant offset</li>
<li>i965/vec4: load dvec3/4 uniforms first in the push constant buffer</li>
</ul>
<p>Tom Stellard (1):</p>
<ul>
<li>gallivm: Make sure module has the correct data layout when pass manager runs</li>
</ul>
</div>
</body>
</html>

155
docs/relnotes/17.1.10.html Normal file

@@ -0,0 +1,155 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.1.10 Release Notes / September 25, 2017</h1>
<p>
Mesa 17.1.10 is a bug fix release which fixes bugs found since the 17.1.9 release.
</p>
<p>
Mesa 17.1.10 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
a48ce6b643a728b2b0f926151930525b3670fbff1fb688527fd9051eab9f30a4 mesa-17.1.10.tar.gz
cbc0d681cc4df47d8deb5a36f45b420978128522fd665b2cd4c7096316f11bdb mesa-17.1.10.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102844">Bug 102844</a> - memory leak with glDeleteProgram for shader program type GL_COMPUTE_SHADER</li>
</ul>
<h2>Changes</h2>
<p>Alexandre Demers (1):</p>
<ul>
<li>osmesa: link with libunwind if enabled (v2)</li>
</ul>
<p>Andres Gomez (12):</p>
<ul>
<li>docs: add sha256 checksums for 17.1.9</li>
<li>cherry-ignore: add "st/mesa: skip draw calls with pipe_draw_info::count == 0"</li>
<li>cherry-ignore: add "radv: use amdgpu_bo_va_op_raw."</li>
<li>cherry-ignore: add "radv: use simpler indirect packet 3 if possible."</li>
<li>cherry-ignore: add "radeonsi: don't always apply the PrimID instancing bug workaround on SI"</li>
<li>cherry-ignore: add "intel/eu/validate: Look up types on demand in execution_type()"</li>
<li>cherry-ignore: add "radv: gfx9 fixes"</li>
<li>cherry-ignore: add "radv/gfx9: set mip0-depth correctly for 2d arrays/3d images"</li>
<li>cherry-ignore: add "radv/gfx9: fix image resource handling."</li>
<li>cherry-ignore: add "docs/egl: remove reference to EGL_DRIVERS_PATH"</li>
<li>cherry-ignore: add "radv: Disable multilayer &amp; multilevel DCC."</li>
<li>cherry-ignore: add "radv: Don't allocate CMASK for linear images."</li>
</ul>
<p>Dave Airlie (2):</p>
<ul>
<li>radv/ac: bump params array for image atomic comp swap</li>
<li>st/glsl-&gt;tgsi: fix u64 to bool comparisons.</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>egl/x11/dri3: adding missing __DRI_BACKGROUND_CALLABLE extension</li>
<li>automake: enable libunwind in `make distcheck'</li>
</ul>
<p>Eric Anholt (3):</p>
<ul>
<li>broadcom/vc4: Fix use-after-free for flushing when writing to a texture.</li>
<li>broadcom/vc4: Fix use-after-free trying to mix a quad and tile clear.</li>
<li>broadcom/vc4: Fix use-after-free when deleting a program.</li>
</ul>
<p>George Kyriazis (1):</p>
<ul>
<li>swr: invalidate attachment on transition change</li>
</ul>
<p>Gert Wollny (2):</p>
<ul>
<li>travis: force llvm-3.3 for "make Gallium ST Other"</li>
<li>travis: Add libunwind-dev to gallium/make builds</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965/blorp: Set r8stencil_needs_update when writing stencil</li>
</ul>
<p>Juan A. Suarez Romero (9):</p>
<ul>
<li>cherry-ignore: add "ac/surface: match Z and stencil tile config"</li>
<li>cherry-ignore: add "radv/nir: call opt_remove_phis after trivial continues."</li>
<li>cherry-ignore: add "amd/common: add workaround for cube map array layer clamping"</li>
<li>cherry-ignore: add "radeonsi: workaround for gather4 on integer cube maps"</li>
<li>cherry-ignore: add "Scons: Add LLVM 5.0 support"</li>
<li>cherry-ignore: add "ac/surface: handle S8 on gfx9"</li>
<li>cherry-ignore: add "radv: Check for GFX9 for 1D arrays in image_size intrinsic."</li>
<li>cherry-ignore: add "glsl/linker: fix output variable overlap check"</li>
<li>Update version to 17.1.10</li>
</ul>
<p>Józef Kucia (1):</p>
<ul>
<li>anv: Fix descriptors copying</li>
</ul>
<p>Matt Turner (2):</p>
<ul>
<li>util: Link libmesautil into u_atomic_test</li>
<li>util/u_atomic: Add implementation of __sync_val_compare_and_swap_8</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>radeonsi: apply a mask to gl_SampleMaskIn in the PS prolog</li>
</ul>
<p>Nicolai Hähnle (4):</p>
<ul>
<li>st/glsl_to_tgsi: only the first (inner-most) array reference can be a 2D index</li>
<li>amd/common: round cube array slice in ac_prepare_cube_coords</li>
<li>radeonsi: set MIP_POINT_PRECLAMP to 0</li>
<li>radeonsi: fix array textures layer coordinate</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>mesa: free current ComputeProgram state in _mesa_free_context_data</li>
</ul>
</div>
</body>
</html>

187
docs/relnotes/17.1.2.html Normal file

@@ -0,0 +1,187 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.1.2 Release Notes / June 5, 2017</h1>
<p>
Mesa 17.1.2 is a bug fix release which fixes bugs found since the 17.1.1 release.
</p>
<p>
Mesa 17.1.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
0d2020c2115db0d13a5be0075abf0da143290f69f5817a2f277861e89166a3e1 mesa-17.1.2.tar.gz
0937804f43746339b1f9540d8f9c8b4a1bb3d3eec0e4020eac283b8799798239 mesa-17.1.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98833">Bug 98833</a> - [REGRESSION, bisected] Wayland revert commit breaks non-Vsync fullscreen frame updates</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100741">Bug 100741</a> - Chromium - Memory leak</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100877">Bug 100877</a> - vulkan/tests/block_pool_no_free regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101110">Bug 101110</a> - Build failure in GNOME Continuous</li>
</ul>
<h2>Changes</h2>
<p>Bartosz Tomczyk (1):</p>
<ul>
<li>mesa: Avoid leaking surface in st_renderbuffer_delete</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>radv: Reserve space for descriptor and push constant user SGPR setting.</li>
</ul>
<p>Daniel Stone (7):</p>
<ul>
<li>vulkan: Fix Wayland uninitialised registry</li>
<li>vulkan/wsi/wayland: Remove roundtrip when creating image</li>
<li>vulkan/wsi/wayland: Use per-display event queue</li>
<li>vulkan/wsi/wayland: Use proxy wrappers for swapchain</li>
<li>egl/wayland: Don't open-code roundtrip</li>
<li>egl/wayland: Use per-surface event queues</li>
<li>egl/wayland: Ensure we get a back buffer</li>
</ul>
<p>Emil Velikov (24):</p>
<ul>
<li>docs: add sha256 checksums for 17.1.1</li>
<li>configure: move platform handling further up</li>
<li>configure: rename remaining HAVE_EGL_PLATFORM_* guards</li>
<li>configure: update remaining --with-egl-platforms references</li>
<li>configure: loosen --with-platforms heuristics</li>
<li>configure: enable the surfaceless platform by default</li>
<li>configure: set HAVE_foo_PLATFORM as applicable</li>
<li>configure: error out when building GLX w/o the X11 platform</li>
<li>configure: check once for DRI3 dependencies</li>
<li>loader: build libloader_dri3_helper.la only with HAVE_PLATFORM_X11</li>
<li>configure: error out when building X11 Vulkan without DRI3</li>
<li>auxiliary/vl: use vl_*_screen_create stubs when building w/o platform</li>
<li>st/va: fix misplaced closing bracket</li>
<li>st/omx: remove unneeded X11 include</li>
<li>st/omx: fix building against X11-less setups</li>
<li>gallium/targets: link against XCB only as needed</li>
<li>configure: error out if building VA w/o supported platform</li>
<li>configure: error out if building OMX w/o supported platform</li>
<li>configure: error out if building VDPAU w/o supported platform</li>
<li>configure: error out if building XVMC w/o supported platform</li>
<li>travis: remove workarounds for the Vulkan target</li>
<li>anv: automake: list shared libraries after the static ones</li>
<li>radv: automake: list shared libraries after the static ones</li>
<li>egl/wayland: select the format based on the interface used</li>
</ul>
<p>Ian Romanick (3):</p>
<ul>
<li>r100: Don't assume that the base mipmap of a texture exists</li>
<li>r100,r200: Don't assume glVisual is non-NULL during context creation</li>
<li>r100: Use _mesa_get_format_base_format in radeon_update_wrapper</li>
</ul>
<p>Jason Ekstrand (17):</p>
<ul>
<li>anv: Handle color layout transitions from the UNINITIALIZED layout</li>
<li>anv: Handle transitioning depth from UNDEFINED to other layouts</li>
<li>anv/image: Get rid of the memset(aux, 0, sizeof(aux)) hack</li>
<li>anv: Predicate 48bit support on gen &gt;= 8</li>
<li>anv: Set up memory types and heaps during physical device init</li>
<li>anv: Set image memory types based on the type count</li>
<li>i965/blorp: Do an end-of-pipe sync on both sides of fast-clear ops</li>
<li>i965: Round copy size to the nearest block in intel_miptree_copy</li>
<li>anv: Set EXEC_OBJECT_ASYNC when available</li>
<li>anv: Determine the type of mapping based on type metadata</li>
<li>anv: Add valid_bufer_usage to the memory type metadata</li>
<li>anv: Stop setting BO flags in bo_init_new</li>
<li>anv: Make supports_48bit_addresses a heap property</li>
<li>anv: Refactor memory type setup</li>
<li>anv: Advertise both 32-bit and 48-bit heaps when we have enough memory</li>
<li>i965: Rework Sandy Bridge HiZ and stencil layouts</li>
<li>anv: Require vertex buffers to come from a 32-bit heap</li>
</ul>
<p>Juan A. Suarez Romero (13):</p>
<ul>
<li>Revert "android: fix segfault within swap_buffers"</li>
<li>cherry-ignore: radeonsi: load patch_id for TES-as-ES when exporting for PS</li>
<li>cherry-ignore: anv: Determine the type of mapping based on type metadata</li>
<li>cherry-ignore: anv: Stop setting BO flags in bo_init_new</li>
<li>cherry-ignore: anv: Make supports_48bit_addresses a heap property</li>
<li>cherry-ignore: anv: Advertise both 32-bit and 48-bit heaps when we have enough memory</li>
<li>cherry-ignore: anv: Require vertex buffers to come from a 32-bit heap</li>
<li>cherry-ignore: radv: fix regression in descriptor set freeing</li>
<li>cherry-ignore: anv: Add valid_bufer_usage to the memory type metadata</li>
<li>cherry-ignore: anv: Refactor memory type setup</li>
<li>Revert "cherry-ignore: anv: [...]"</li>
<li>Revert "cherry-ignore: anv: Require vertex buffers to come from a 32-bit heap"</li>
<li>Update version to 17.1.2</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>radeonsi/gfx9: compile shaders with +xnack</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>st/mesa: remove redundant stfb-&gt;iface checks</li>
</ul>
<p>Nicolas Boichat (1):</p>
<ul>
<li>configure.ac: Also match -androideabi tuple</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>freedreno: fix fence creation fail if no rendering</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>egl/android: fix segfault within swap_buffers</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>st/mesa: don't mark the program as in cache_fallback when there is cache miss</li>
</ul>
</div>
</body>
</html>

156
docs/relnotes/17.1.3.html Normal file

@@ -0,0 +1,156 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.1.3 Release Notes / June 19, 2017</h1>
<p>
Mesa 17.1.3 is a bug fix release which fixes bugs found since the 17.1.2 release.
</p>
<p>
Mesa 17.1.3 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
81ae9127286ff8d631e466d258608d6dea9854fe7bee2e8521da44c7544f01e5 mesa-17.1.3.tar.gz
5f1ee9a8aea2880f887884df2dea0c16dd1b13eb42fd2e52265db0dc1b380e8c mesa-17.1.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100988">Bug 100988</a> - glXGetCurrentDisplay() no longer works for FakeGLX contexts?</li>
</ul>
<h2>Changes</h2>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>radv: Set both compute and graphics SGPRS on descriptor set flush.</li>
<li>radv: Dirty all descriptor sets when changing the pipeline.</li>
<li>radv: Remove SI num RB override for occlusion queries.</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>xlib: fix glXGetCurrentDisplay() failure</li>
</ul>
<p>Chad Versace (1):</p>
<ul>
<li>i965/dri: Fix bad GL error in intel_create_winsys_renderbuffer()</li>
</ul>
<p>Chuck Atkins (1):</p>
<ul>
<li>configure.ac: Reduce zlib requirement from 1.2.8 to 1.2.3.</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>radv: expose integrated device type for APUs.</li>
<li>radv: set fmask state to all 0s when no fmask. (v2)</li>
<li>glsl/lower_distance: only set max_array_access for 1D clip dist arrays</li>
</ul>
<p>Emil Velikov (1):</p>
<ul>
<li>Update version to 17.1.3</li>
</ul>
<p>Grazvydas Ignotas (1):</p>
<ul>
<li>radv: fix trace dumping for !use_ib_bos</li>
</ul>
<p>Jason Ekstrand (4):</p>
<ul>
<li>i965/blorp: Take a layer range in intel_hiz_exec</li>
<li>i965: Move the pre-depth-clear flush/stalls to intel_hiz_exec</li>
<li>i965: Perform HiZ flush/stall prior to HiZ resolves</li>
<li>i965: Mark depth surfaces as needing a HiZ resolve after blitting</li>
</ul>
<p>José Fonseca (1):</p>
<ul>
<li>automake: Link all libGL.so variants with -Bsymbolic.</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.1.2</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>etnaviv: always do cpu_fini in transfer_unmap</li>
</ul>
<p>Lyude (1):</p>
<ul>
<li>nvc0: disable BGRA8 images on Fermi</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>st/mesa: don't load cached TGSI shaders on demand</li>
<li>radeonsi: fix a GPU hang with tessellation on 2-CU configs</li>
<li>radeonsi: disable the patch ID workaround on SI when the patch ID isn't used (v2)</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>radv: fewer than 8 RBs are possible</li>
</ul>
<p>Nicolas Dechesne (1):</p>
<ul>
<li>util/rand_xor: add missing include statements</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>egl: fix _eglQuerySurface in EGL_BUFFER_AGE_EXT case</li>
</ul>
<p>Thomas Hellstrom (1):</p>
<ul>
<li>dri3/GLX: Fix drawable invalidation v2</li>
</ul>
<p>Tim Rowley (1):</p>
<ul>
<li>swr: relax c++ requirement from c++14 to c++11</li>
</ul>
</div>
</body>
</html>

220
docs/relnotes/17.1.4.html Normal file

@@ -0,0 +1,220 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.1.4 Release Notes / June 30, 2017</h1>
<p>
Mesa 17.1.4 is a bug fix release which fixes bugs found since the 17.1.3 release.
</p>
<p>
Mesa 17.1.4 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
f82fbbdf2dcec0e7e5aa3a8fe4bacd50bf4b7293cc6e1a56658ae6504d732362 mesa-17.1.4.tar.gz
06f3b0e6a28f0d20b7f3391cf67fe89ae98ecd0a686cd545da76557b6cec9cad mesa-17.1.4.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77240">Bug 77240</a> - khrplatform.h not installed if EGL is disabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95530">Bug 95530</a> - Stellaris - colored overlay of sectors doesn't render on i965</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96958">Bug 96958</a> - [SKL] Improper rendering in Europa Universalis IV</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99467">Bug 99467</a> - [radv] DOOM 2016 + wine. Green screen everywhere (but can be started)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101071">Bug 101071</a> - compiling glsl fails with undefined reference to `pthread_create'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101252">Bug 101252</a> - eglGetDisplay() is not thread safe</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101294">Bug 101294</a> - radeonsi minecraft forge splash freeze since 17.1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101451">Bug 101451</a> - [G33] ES2-CTS.functional.clipping.polygon regression</li>
</ul>
<h2>Changes</h2>
<p>Alex Deucher (1):</p>
<ul>
<li>radeonsi: add new polaris12 pci id</li>
</ul>
<p>Andres Gomez (3):</p>
<ul>
<li>cherry-ignore: 17.1.4 rejected commits</li>
<li>cherry-ignore: bin/get-fixes-pick-list.sh: better identify multiple "fixes:" tags</li>
<li>Update version to 17.1.4</li>
</ul>
<p>Anuj Phogat (2):</p>
<ul>
<li>i965: Add and initialize l3_banks field for gen7+</li>
<li>i965: Fix broxton 2x6 l3 config</li>
</ul>
<p>Ben Crocker (1):</p>
<ul>
<li>egl_dri2: swrastGetDrawableInfo: set *x, *y [v2]</li>
</ul>
<p>Brian Paul (2):</p>
<ul>
<li>svga: check return value from svga_set_shader( SVGA3D_SHADERTYPE_GS, NULL)</li>
<li>gallium/vbuf: avoid segfault when we get invalid glDrawRangeElements()</li>
</ul>
<p>Chad Versace (1):</p>
<ul>
<li>egl/android: Change order of EGLConfig generation (v2)</li>
</ul>
<p>Chandu Babu N (1):</p>
<ul>
<li>change va max_entrypoints</li>
</ul>
<p>Charmaine Lee (1):</p>
<ul>
<li>svga: use the winsys interface to invalidate surface</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: add sha256 checksums for 17.1.3</li>
<li>configure.ac: add -pthread to PTHREAD_LIBS</li>
<li>radeonsi: include ac_binary.h for struct ac_shader_binary</li>
</ul>
<p>Eric Engestrom (3):</p>
<ul>
<li>egl: properly count configs</li>
<li>egl/display: only detect the platform once</li>
<li>egl/display: make platform detection thread-safe</li>
</ul>
<p>Eric Le Bihan (1):</p>
<ul>
<li>Fix khrplatform.h not installed if EGL is disabled.</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>i965: update MaxTextureRectSize to match PRMs and comply with OpenGL 4.1+</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>nv50/ir: fetch indirect sources BEFORE the op that uses them</li>
<li>nv50/ir: fix combineLd/St to update existing records as necessary</li>
</ul>
<p>Jason Ekstrand (10):</p>
<ul>
<li>i965: Flush around state base address</li>
<li>i965: Take a uint64_t immediate in emit_pipe_control_write</li>
<li>i965: Unify the two emit_pipe_control functions</li>
<li>i965: Do an end-of-pipe sync prior to STATE_BASE_ADDRESS</li>
<li>i965/blorp: Do an end-of-pipe sync around CCS ops</li>
<li>i965: Do an end-of-pipe sync after flushes</li>
<li>i965: Disable the interleaved vertex optimization when instancing</li>
<li>i965: Set step_rate = 0 for interleaved vertex buffers</li>
<li>spirv: Work around the Doom shader bug</li>
<li>i965: Clamp clear colors to the representable range</li>
</ul>
<p>Jonas Kulla (1):</p>
<ul>
<li>anv: Fix L3 cache programming on Bay Trail</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Ignore anisotropic filtering in nearest mode.</li>
</ul>
<p>Lucas Stach (7):</p>
<ul>
<li>etnaviv: don't try RS blit if blit region is unaligned</li>
<li>etnaviv: use padded width/height for resource copies</li>
<li>etnaviv: remove bogus assert</li>
<li>etnaviv: replace translate_clear_color with util_pack_color</li>
<li>etnaviv: mask correct channel for RB swapped rendertargets</li>
<li>etnaviv: advertise correct max LOD bias</li>
<li>etnaviv: only flush resource to self if no scanout buffer exists</li>
</ul>
<p>Marek Olšák (4):</p>
<ul>
<li>winsys/amdgpu: fix a deadlock when waiting for submission_in_progress</li>
<li>mesa: flush vertices before changing viewports</li>
<li>mesa: flush vertices before updating ctx-&gt;_Shader</li>
<li>st/mesa: fix pipe_rasterizer_state::scissor with multiple viewports</li>
</ul>
<p>Michel Dänzer (1):</p>
<ul>
<li>gallium/util: Break recursion in pipe_resource_reference</li>
</ul>
<p>Nicolai Hähnle (2):</p>
<ul>
<li>gallium/radeon/gfx9: fix PBO texture uploads to compressed textures</li>
<li>amd/common: fix off-by-one in sid_tables.py</li>
</ul>
<p>Pierre Moreau (1):</p>
<ul>
<li>nv50/ir: Properly fold constants in SPLIT operation</li>
</ul>
<p>Rob Herring (1):</p>
<ul>
<li>Android: major/minor/makedev live in &lt;sys/sysmacros.h&gt;</li>
</ul>
<p>Topi Pohjolainen (2):</p>
<ul>
<li>i965: Add an end-of-pipe sync helper</li>
<li>i965/gen4: Set depth offset when there is stencil attachment only</li>
</ul>
<p>Ville Syrjälä (2):</p>
<ul>
<li>i915: Fix gl_Fragcoord interpolation</li>
<li>i915: Fix wpos_tex vs. -1 comparison</li>
</ul>
</div>
</body>
</html>

203
docs/relnotes/17.1.5.html Normal file

@@ -0,0 +1,203 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.1.5 Release Notes / July 14, 2017</h1>
<p>
Mesa 17.1.5 is a bug fix release which fixes bugs found since the 17.1.4 release.
</p>
<p>
Mesa 17.1.5 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
7e3eeee8f9c28052796eb18133c2be12c38ba34864cc496382a2fa20c29b0317 mesa-17.1.5.tar.gz
378516b171712687aace4c7ea8b37c85895231d7a6d61e1e27362cf6034fded9 mesa-17.1.5.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100242">Bug 100242</a> - radeon buffer allocation failure during startup of Factorio</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101657">Bug 101657</a> - strtod.c:32:10: fatal error: xlocale.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101666">Bug 101666</a> - bitfieldExtract is marked as a built-in function on OpenGL ES 3.0, but was added in OpenGL ES 3.1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101703">Bug 101703</a> - No stencil buffer allocated when requested by GLUT</li>
</ul>
<h2>Changes</h2>
<p>Aaron Watry (1):</p>
<ul>
<li>radeon/winsys: Limit max allocation size to 70% of VRAM</li>
</ul>
<p>Aleksander Morgado (2):</p>
<ul>
<li>etnaviv: fix refcnt initialization in etna_screen</li>
<li>etnaviv: don't dereference etna_resource pointer if allocation fails</li>
</ul>
<p>Alex Smith (2):</p>
<ul>
<li>ac/nir: Use correct LLVM intrinsics for atomic ops on imageBuffers</li>
<li>ac/nir: Fix ordering of parameters for image atomic cmpswap intrinsics</li>
</ul>
<p>Andres Gomez (3):</p>
<ul>
<li>docs: add sha256 checksums for 17.1.4</li>
<li>cherry-ignore: i965: Fix anisotropic filtering for mag filter</li>
<li>Update version to 17.1.5</li>
</ul>
<p>Anuj Phogat (2):</p>
<ul>
<li>intel/isl: Use uint64_t to store total surface size</li>
<li>intel/isl: Add the maximum surface size limit</li>
</ul>
<p>Brian Paul (3):</p>
<ul>
<li>draw: check for line_width != 1.0f in validate_pipeline()</li>
<li>svga: clamp device line width to at least 1 to fix HWv8 line stippling</li>
<li>svga: fix PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE value</li>
</ul>
<p>Bruce Cherniak (1):</p>
<ul>
<li>swr: Limit memory held by defer deleted resources.</li>
</ul>
<p>Chandu Babu N (1):</p>
<ul>
<li>st/va: Fix leak in VAAPI subpictures</li>
</ul>
<p>Charmaine Lee (1):</p>
<ul>
<li>svga: fixed surface size to include array size</li>
</ul>
<p>Connor Abbott (2):</p>
<ul>
<li>spirv: fix OpBitcast when the src and dst bitsize are different (v3)</li>
<li>ac/nir: implement 64-bit packing and unpacking</li>
</ul>
<p>Iago Toral Quiroga (1):</p>
<ul>
<li>glsl: gl_Max{Vertex,Fragment}UniformComponents exist in all desktop GL versions</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>glsl: check if any of the named builtins are available first</li>
</ul>
<p>James Legg (2):</p>
<ul>
<li>ac/nir: Make intrinsic_name buffer long enough</li>
<li>spirv: Fix reaching unreachable for compare exchange on images</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>nir/spirv: Use the type from the deref for atomics</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>glsl: do not call link_xfb_stride_layout_qualifiers() for fragment shaders</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Use true AA line distance on G45/Ironlake.</li>
<li>i965: Always set AALINEDISTANCE_TRUE on Sandybridge.</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>etnaviv: fix shader miscompilation with more than 16 labels</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>gallium/radeon: fix a possible crash for buffer exports</li>
</ul>
<p>Neha Bhende (1):</p>
<ul>
<li>svga: loop over box.depth for ReadBack_image on each slice</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>winsys/radeon: only call pb_slabs_reclaim when slabs are actually used</li>
</ul>
<p>Olivier Lauffenburger (1):</p>
<ul>
<li>st/wgl: improve selection of pixel format</li>
</ul>
<p>Philipp Zabel (1):</p>
<ul>
<li>st/mesa: release EGLImage on EGLImageTarget* error</li>
</ul>
<p>Plamena Manolova (1):</p>
<ul>
<li>mesa/main: Move NULL pointer check.</li>
</ul>
<p>Tim Rowley (2):</p>
<ul>
<li>swr/rast: _mm*_undefined_* implementations for gcc&lt;4.9</li>
<li>swr/rast: Correctly allocate SWR_STATS memory as cacheline aligned</li>
</ul>
<p>Tomasz Figa (1):</p>
<ul>
<li>intel: common: Fix link failure with standalone Android build</li>
</ul>
<p>Vinson Lee (1):</p>
<ul>
<li>scons: Check for xlocale.h before defining HAVE_XLOCALE_H.</li>
</ul>
</div>
</body>
</html>

225
docs/relnotes/17.1.6.html Normal file

@@ -0,0 +1,225 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.1.6 Release Notes / August 7, 2017</h1>
<p>
Mesa 17.1.6 is a bug fix release which fixes bugs found since the 17.1.5 release.
</p>
<p>
Mesa 17.1.6 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
971831bc1e748b3e8367eee6b9eb509bad2970e3c2f8520ad25f5caa12ca5491 mesa-17.1.6.tar.gz
0686deadde1f126b20aa67e47e8c50502043eee4ecdf60d5009ffda3cebfee50 mesa-17.1.6.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97957">Bug 97957</a> - Awful screen tearing in a separate X server with DRI3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101683">Bug 101683</a> - Some games hang while loading when compositing is shut off or absent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101867">Bug 101867</a> - Launch options window renders black in Feral Games in current Mesa trunk</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.1.5</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>radv: Don't underflow non-visible VRAM size.</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>svga: fix texture swizzle writemasking</li>
</ul>
<p>Chad Versace (1):</p>
<ul>
<li>anv/image: Fix VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT</li>
</ul>
<p>Chris Wilson (1):</p>
<ul>
<li>i965: Resolve framebuffers before signaling the fence</li>
</ul>
<p>Connor Abbott (1):</p>
<ul>
<li>nir: fix algebraic optimizations</li>
</ul>
<p>Daniel Stone (1):</p>
<ul>
<li>st/dri: Check get-handle return value in queryImage</li>
</ul>
<p>Dave Airlie (5):</p>
<ul>
<li>radv: fix non-0 based layer clears.</li>
<li>radv: fix buffer views on SI/CIK.</li>
<li>radv/ac: realign SI workaround with radeonsi.</li>
<li>radv/ac: port SI TC L1 write corruption fix.</li>
<li>radv: for stencil only set Z tile mode index to same value</li>
</ul>
<p>Emil Velikov (23):</p>
<ul>
<li>cherry-ignore: add "anv: Round u_vector element sizes to a power of two"</li>
<li>anv: advertise v6 of the wayland surface extension</li>
<li>radv: advertise v6 of the wayland surface extension</li>
<li>swrast: add dri2ConfigQueryExtension to the correct extension list</li>
<li>cherry-ignore: add "anv: Transition MCS buffers from the undefined layout"</li>
<li>swr: don't forget to link AVX/AVX2 against pthreads</li>
<li>cherry-ignore: add "i965: Fix offset addition in get_isl_surf"</li>
<li>cherry-ignore: add "i965: Fix = vs == in MCS aux usage assert."</li>
<li>cherry-ignore: add a couple of radeon commits</li>
<li>cherry-ignore: add "swr/rast: non-regex knob fallback code for gcc &lt; 4.9"</li>
<li>cherry-ignore: add "swr: fix transform feedback logic"</li>
<li>cherry-ignore: add a couple of radeonsi/gfx9 commits</li>
<li>cherry-ignore: ignore reverted st/mesa commit</li>
<li>cherry-ignore: add bindless textures fix</li>
<li>cherry-ignore: add "st/glsl_to_tgsi: fix getting the image type for array of structs"</li>
<li>cherry-ignore: add yet another bindless textures fix</li>
<li>bin/cherry-ignore: add radeonsi "fix of a fix"</li>
<li>travis: lower SWR requirement to GCC 4.8, aka std=c++11</li>
<li>i965: use strtol to convert the integer deviceID override</li>
<li>swr: remove unneeded fallback strcasecmp define</li>
<li>cherry-ignore: add a bunch more commits to the list</li>
<li>fixup! cherry-ignore: add a bunch more commits to the list</li>
<li>Update version to 17.1.6</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>broadcom/vc4: Prefer blit via rendering to the software fallback.</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>configure: only install khrplatform.h if needed</li>
</ul>
<p>Iago Toral Quiroga (2):</p>
<ul>
<li>anv/cmd_buffer: fix off by one error in assertion</li>
<li>anv: only expose up to 28 vertex attributes</li>
</ul>
<p>Ilia Mirkin (1):</p>
<ul>
<li>nv50/ir: fix threads calculation for non-compute shaders</li>
</ul>
<p>Jason Ekstrand (5):</p>
<ul>
<li>anv/cmd_buffer: Properly handle render passes with 0 attachments</li>
<li>anv: Stop leaking the no_aux sampler surface state</li>
<li>anv/image: Add INPUT_ATTACHMENT to the list of required usages</li>
<li>nir/vars_to_ssa: Handle missing struct members in foreach_deref_node</li>
<li>spirv: Fix SpvImageFormatR16ui</li>
</ul>
<p>Juan A. Suarez Romero (2):</p>
<ul>
<li>anv/pipeline: use unsigned long long constant to check enable vertex inputs</li>
<li>anv/pipeline: do not use BITFIELD64_BIT()</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>nir: Use nir_src_copy instead of direct assignments.</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>i965: perf: flush batchbuffers at the beginning of queries</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>etnaviv: fix memory leak when BO allocation fails</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>st/mesa: always unconditionally revalidate main framebuffer after SwapBuffers</li>
<li>gallium/radeon: make S_FIXED function signed and move it to shared code</li>
</ul>
<p>Mark Thompson (1):</p>
<ul>
<li>st/va: Fix scaling list ordering for H.265</li>
</ul>
<p>Nicolai Hähnle (4):</p>
<ul>
<li>radeonsi/gfx9: fix crash building monolithic merged ES-GS shader</li>
<li>radeonsi: fix detection of DRAW_INDIRECT_MULTI on SI</li>
<li>radeonsi/gfx9: reduce max threads per block to 1024 on gfx9+</li>
<li>gallium/radeon: fix ARB_query_buffer_object conversion to boolean</li>
</ul>
<p>Thomas Hellstrom (2):</p>
<ul>
<li>loader/dri3: Use dri3_find_back in loader_dri3_swap_buffers_msc</li>
<li>dri3: Wait for all pending swapbuffers to be scheduled before touching the front</li>
</ul>
<p>Tim Rowley (3):</p>
<ul>
<li>gallium/util: fix nondeterministic avx512 detection</li>
<li>swr/rast: quit using linux-specific gettid()</li>
<li>swr/rast: fix scons gen_knobs.h dependency</li>
</ul>
<p>Timothy Arceri (1):</p>
<ul>
<li>nir: fix nir_opt_copy_prop_vars() for arrays of arrays</li>
</ul>
<p>Wladimir J. van der Laan (1):</p>
<ul>
<li>etnaviv: Clear lbl_usage array correctly</li>
</ul>
</div>
</body>
</html>

148
docs/relnotes/17.1.7.html Normal file

@@ -0,0 +1,148 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.1.7 Release Notes / August 21, 2017</h1>
<p>
Mesa 17.1.7 is a bug fix release which fixes bugs found since the 17.1.6 release.
</p>
<p>
Mesa 17.1.7 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
7ca484fe3194e8185d9a20261845bfd284cc40d0f3fda690d317f85ac7b91af5 mesa-17.1.7.tar.gz
69f472a874b1122404fa0bd13e2d6bf87eb3b9ad9c21d2f39872a96d83d9e5f5 mesa-17.1.7.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101334">Bug 101334</a> - AMD SI cards: Some vulkan apps freeze the system</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101766">Bug 101766</a> - Assertion `!&quot;invalid type&quot;' failed when constant expression involves literal of different type</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102024">Bug 102024</a> - FORMAT_FEATURE_SAMPLED_IMAGE_BIT not supported for D16_UNORM and D32_SFLOAT</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102148">Bug 102148</a> - Crash when running qopenglwidget example on mesa llvmpipe win32</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102241">Bug 102241</a> - gallium/wgl: SwapBuffers freezing regularly with swap interval enabled</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (8):</p>
<ul>
<li>cherry-ignore: add "swr: use the correct variable for no undefined symbols"</li>
<li>cherry-ignore: add "radeon/ac: use ds_swizzle for derivs on si/cik."</li>
<li>cherry-ignore: add "configure: remove trailing "-a" in swr architecture test" (stable: 17.2 nomination only)</li>
<li>cherry-ignore: added 17.2 nominations.</li>
<li>cherry-ignore: add "radv: Handle VK_ATTACHMENT_UNUSED in color attachments."</li>
<li>cherry-ignore: add "virgl: drop precise modifier."</li>
<li>cherry-ignore: add "radv: handle 10-bit format clamping workaround."</li>
<li>Update version to 17.1.7</li>
</ul>
<p>Chris Wilson (1):</p>
<ul>
<li>i965/blit: Remember to include miptree buffer offset in relocs</li>
</ul>
<p>Connor Abbott (1):</p>
<ul>
<li>ac/nir: fix lsb emission</li>
</ul>
<p>Dave Airlie (5):</p>
<ul>
<li>intel/vec4/gs: reset nr_pull_param if DUAL_INSTANCED compile failed.</li>
<li>radv: avoid GPU hangs if someone does a resolve with non-multisample src (v2)</li>
<li>radv: fix f16-&gt;f32 denorm handling for SI/CIK. (v2)</li>
<li>radv: fix MSAA on SI gpus.</li>
<li>radv: force cs/ps/l2 flush at end of command stream. (v2)</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>docs: add sha256 checksums for 17.1.6</li>
<li>egl/x11: don't leak xfixes_query in the error path</li>
<li>egl: avoid eglCreatePlatform*Surface{EXT,} crash with invalid dpy</li>
</ul>
<p>Eric Anholt (1):</p>
<ul>
<li>util: Fix build on old glibc.</li>
</ul>
<p>Frank Richter (3):</p>
<ul>
<li>st/mesa: fix a null pointer access</li>
<li>st/wgl: check for negative delta in wait_swap_interval()</li>
<li>gallium/os: fix os_time_get_nano() to roll over less</li>
</ul>
<p>Ilia Mirkin (3):</p>
<ul>
<li>glsl/ast: update rhs in addition to the var's constant_value</li>
<li>nv50/ir: fix srcMask computation for TG4 and TXF</li>
<li>nv50/ir: fix TXQ srcMask</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>anv/formats: Allow sampling on depth-only formats on gen7</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nv50/ir: fix ConstantFolding with saturation</li>
</ul>
<p>Kenneth Graunke (1):</p>
<ul>
<li>i965: Delete pitch alignment assertion in get_blit_intratile_offset_el.</li>
</ul>
<p>Marek Olšák (2):</p>
<ul>
<li>ac: fail shader compilation if libelf is replaced by an incompatible version</li>
<li>radeonsi: disable CE by default</li>
</ul>
<p>Tim Rowley (1):</p>
<ul>
<li>swr/rast: Fix invalid casting for calls to Interlocked* functions</li>
</ul>
</div>
</body>
</html>

115
docs/relnotes/17.1.8.html Normal file

@@ -0,0 +1,115 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.1.8 Release Notes / August 28, 2017</h1>
<p>
Mesa 17.1.8 is a bug fix release which fixes bugs found since the 17.1.7 release.
</p>
<p>
Mesa 17.1.8 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
faa59a677e88fd5224cdfebcdb6ca9ad3e3c64bd562baa8d5c3c1faeef1066b6 mesa-17.1.8.tar.gz
75ed2eaeae26ddd536150f294386468ae2e1a7717948c41cd14b7875be5269db mesa-17.1.8.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101910">Bug 101910</a> - [BYT] ES31-CTS.functional.copy_image.non_compressed.viewclass_96_bits.rgb32f_rgb32f</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102308">Bug 102308</a> - segfault in glCompressedTextureSubImage3D</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (6):</p>
<ul>
<li>docs: add sha256 checksums for 17.1.7</li>
<li>cherry-ignore: cherry-ignore: added 17.2 nominations.</li>
<li>cherry-ignore: add "i965/tex: Don't pass samples to miptree_create_for_teximage"</li>
<li>cherry-ignore: add "i965: Make a BRW_NEW_FAST_CLEAR_COLOR dirty bit."</li>
<li>cherry-ignore: add "egl/drm: Fix misused x and y offsets in swrast_*_image*"</li>
<li>Update version to 17.1.8</li>
</ul>
<p>Christoph Haag (1):</p>
<ul>
<li>mesa: only copy requested compressed teximage cubemap faces</li>
</ul>
<p>Dave Airlie (1):</p>
<ul>
<li>radv: don't crash if we have no framebuffer</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>glsl: add a few missing int64 constant propagation cases</li>
<li>nv50/ir: properly set sType for TXF ops to U32</li>
</ul>
<p>Jason Ekstrand (1):</p>
<ul>
<li>i965: Stop looking at NewDriverState when emitting 3DSTATE_URB</li>
</ul>
<p>Kai Chen (1):</p>
<ul>
<li>egl/wayland: Use roundtrips when awaiting buffer release</li>
</ul>
<p>Lionel Landwerlin (1):</p>
<ul>
<li>i965: perf: minimize the chances to spread queries across batchbuffers</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>radeonsi/gfx9: add a temporary workaround for a tessellation driver bug</li>
</ul>
<p>Tim Rowley (1):</p>
<ul>
<li>swr/rast: switch gen_knobs.cpp license</li>
</ul>
<p>Topi Pohjolainen (1):</p>
<ul>
<li>intel/blorp: Adjust intra-tile x when faking rgb with red-only</li>
</ul>
</div>
</body>
</html>

144
docs/relnotes/17.1.9.html Normal file

@@ -0,0 +1,144 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.1.9 Release Notes / September 8, 2017</h1>
<p>
Mesa 17.1.9 is a bug fix release which fixes bugs found since the 17.1.8 release.
</p>
<p>
Mesa 17.1.9 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
4325401b07b5f44759da781bc8d7c0a4a7244e09a702d16c037090986e07ee22 mesa-17.1.9.tar.gz
5f51ad94341696097d5df7b838183534478216858ac0fc8de183671a36ffea1a mesa-17.1.9.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100613">Bug 100613</a> - Regression in Mesa 17 on s390x (zSystems)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102454">Bug 102454</a> - glibc 2.26 doesn't provide anymore xlocale.h</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102467">Bug 102467</a> - src/mesa/state_tracker/st_cb_readpixels.c:178]: (warning) Redundant assignment</li>
</ul>
<h2>Changes</h2>
<p>Andres Gomez (8):</p>
<ul>
<li>docs: add sha256 checksums for 17.1.8</li>
<li>cherry-ignore: added 17.2 nominations.</li>
<li>cherry-ignore: add "nir: Fix system_value_from_intrinsic for subgroups"</li>
<li>cherry-ignore: add "i965: Fix crash in fallback GTT mapping."</li>
<li>cherry-ignore: add "radeonsi/gfx9: always flush DB metadata on framebuffer changes"</li>
<li>cherry-ignore: add "radv: Fix vkCopyImage with both depth and stencil aspects."</li>
<li>cherry-ignore: add "radeonsi/gfx9: proper workaround for LS/HS VGPR initialization bug"</li>
<li>Update version to 17.1.9</li>
</ul>
<p>Bas Nieuwenhuizen (3):</p>
<ul>
<li>radv: Fix off by one in MAX_VBS assert.</li>
<li>radv: Fix sparse BO mapping merging.</li>
<li>radv: Actually set the cmd_buffer usage_flags.</li>
</ul>
<p>Ben Crocker (1):</p>
<ul>
<li>llvmpipe: lp_build_gather_elem_vec BE fix for 3x16 load</li>
</ul>
<p>Charmaine Lee (1):</p>
<ul>
<li>vbo: fix offset in minmax cache key</li>
</ul>
<p>Christian Gmeiner (1):</p>
<ul>
<li>etnaviv: use correct param for etna_compatible_rs_format(..)</li>
</ul>
<p>Emil Velikov (3):</p>
<ul>
<li>egl: don't NULL deref the .get_capabilities function pointer</li>
<li>egl/wayland: plug leaks in dri2_wl_create_window_surface() error path</li>
<li>egl/wayland: polish object teardown in dri2_wl_destroy_surface</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>util: improve compiler guard</li>
</ul>
<p>Grazvydas Ignotas (2):</p>
<ul>
<li>radv: clear dynamic_shader_stages on create</li>
<li>radv: don't assert on empty hash table</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>glsl: fix counting of vertex shader output slots used by explicit vars</li>
<li>st/mesa: fix handling of vertex array double inputs</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>anv/formats: Nicely handle unknown VkFormat enums</li>
<li>spirv: Add support for the HelperInvocation builtin</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nvc0: write 0 to pipeline_statistics.cs_invocations</li>
</ul>
<p>Michael Olbrich (1):</p>
<ul>
<li>egl/dri2: only destroy created objects</li>
</ul>
<p>Ray Strode (1):</p>
<ul>
<li>gallivm: correct channel shift logic on big endian</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>st/mesa: fix view template initialization in try_pbo_readpixels</li>
</ul>
</div>
</body>
</html>

218
docs/relnotes/17.2.0.html Normal file
View File

@@ -0,0 +1,218 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.2.0 Release Notes / September 4, 2017</h1>
<p>
Mesa 17.2.0 is a new development release.
People who are concerned with stability and reliability should stick
with a previous release or wait for Mesa 17.2.1.
</p>
<p>
Mesa 17.2.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
9484ad96b4bb6cda5bbf1aef52dfa35183dc21aa6258a2991c245996c2fdaf85 mesa-17.2.0.tar.gz
3123448f770eae58bc73e15480e78909defb892f10ab777e9116c9b218094943 mesa-17.2.0.tar.xz
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>GL_ARB_bindless_texture on radeonsi</li>
<li>GL_ARB_post_depth_coverage on nvc0 (GM200+)</li>
<li>GL_ARB_shader_ballot on i965/gen8+</li>
<li>GL_ARB_shader_group_vote on i965 (with a no-op vec4 implementation)</li>
<li>GL_ARB_shader_viewport_layer_array on nvc0 (GM200+)</li>
<li>GL_AMD_vertex_shader_layer on nvc0 (GM200+)</li>
<li>GL_AMD_vertex_shader_viewport_index on nvc0 (GM200+)</li>
</ul>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=68365">Bug 68365</a> - [SNB Bisected]Piglit spec_ARB_framebuffer_object_fbo-blit-stretch fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=77240">Bug 77240</a> - khrplatform.h not installed if EGL is disabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=95530">Bug 95530</a> - Stellaris - colored overlay of sectors doesn't render on i965</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96449">Bug 96449</a> - Dying Light reports OpenGL version 3.0 with mesa-git</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=96958">Bug 96958</a> - [SKL] Improper rendering in Europa Universalis IV</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97524">Bug 97524</a> - Samplers referring to the same texture unit with different types should raise GL_INVALID_OPERATION</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=97957">Bug 97957</a> - Awful screen tearing in a separate X server with DRI3</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98238">Bug 98238</a> - Witcher 2: objects are black when changing lod on Radeon Pitcairn</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98428">Bug 98428</a> - Undefined non-weak-symbol in dri-drivers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=98833">Bug 98833</a> - [REGRESSION, bisected] Wayland revert commit breaks non-Vsync fullscreen frame updates</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=99467">Bug 99467</a> - [radv] DOOM 2016 + wine. Green screen everywhere (but can be started)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100070">Bug 100070</a> - Rocket League: grass gets rendered incorrectly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100242">Bug 100242</a> - radeon buffer allocation failure during startup of Factorio</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100620">Bug 100620</a> - [SKL] 48-bit addresses break DOOM</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100690">Bug 100690</a> - [Regression, bisected] TotalWar: Warhammer corrupted graphics</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100741">Bug 100741</a> - Chromium - Memory leak</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100785">Bug 100785</a> - [regression, bisected] arb_gpu_shader5 piglit fail</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100854">Bug 100854</a> - YUV to RGB Color Space Conversion result is not precise</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100871">Bug 100871</a> - gles cts hangs mesa indefinitely</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100877">Bug 100877</a> - vulkan/tests/block_pool_no_free regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100892">Bug 100892</a> - Polaris 12: winsys init bad switch (missing break) initializing addrlib</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100925">Bug 100925</a> - [HSW/BSW/BDW/SKL] Google Earth is not resolving all the details in the map correctly</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100937">Bug 100937</a> - Mesa fails to build with GCC 4.8</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100945">Bug 100945</a> - Build failure in GNOME Continuous</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100988">Bug 100988</a> - glXGetCurrentDisplay() no longer works for FakeGLX contexts?</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101071">Bug 101071</a> - compiling glsl fails with undefined reference to `pthread_create'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101088">Bug 101088</a> - `gallium: remove pipe_index_buffer and set_index_buffer` causes glitches and crash in gallium nine</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101110">Bug 101110</a> - Build failure in GNOME Continuous</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101189">Bug 101189</a> - Latest git fails to compile with radeon</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101252">Bug 101252</a> - eglGetDisplay() is not thread safe</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101254">Bug 101254</a> - VDPAU videos don't start playing with r600 gallium driver</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101283">Bug 101283</a> - skylake: page fault accessing address 0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101284">Bug 101284</a> - [G45] ES2-CTS.functional.texture.specification.basic_copytexsubimage2d.cube_rgba</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101294">Bug 101294</a> - radeonsi minecraft forge splash freeze since 17.1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101306">Bug 101306</a> - [BXT] gles asserts in cts</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101326">Bug 101326</a> - gallium/wgl: Allow context creation without prior SetPixelFormat()</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101334">Bug 101334</a> - AMD SI cards: Some vulkan apps freeze the system</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101336">Bug 101336</a> - glcpp-test.sh regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101340">Bug 101340</a> - i915_surface.c:108:4: error: too few arguments to function util_blitter_default_src_texture</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101360">Bug 101360</a> - Assertion failure comparing result of ballotARB</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101401">Bug 101401</a> - [REGRESSION][BISECTED] GDM fails to start after 8ec4975cd83365c791a1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101418">Bug 101418</a> - Build failure in GNOME Continuous</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101451">Bug 101451</a> - [G33] ES2-CTS.functional.clipping.polygon regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101464">Bug 101464</a> - PrimitiveRestartNV inside a render list causes a crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101471">Bug 101471</a> - Mesa fails to build: unknown typename bool</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101535">Bug 101535</a> - [bisected] [Skylake] Kwin won't start and glxgears coredumps</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101538">Bug 101538</a> - From &quot;Use isl for hiz layouts&quot; commit onwards, everything crashes with Mesa</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101539">Bug 101539</a> - [Regresion] [IVB] Segment fault in recent commit in intel_miptree_level_has_hiz under Ivy bridge</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101558">Bug 101558</a> - [regression][bisected] MPV playing video via opengl &quot;randomly&quot; results in only part of the window / screen being rendered with Mesa GIT.</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101596">Bug 101596</a> - Blender renders black UI elements</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101607">Bug 101607</a> - Regression in anisotropic filtering from &quot;i965: Convert fs sampler state to use genxml&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101657">Bug 101657</a> - strtod.c:32:10: fatal error: xlocale.h: No such file or directory</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101666">Bug 101666</a> - bitfieldExtract is marked as a built-in function on OpenGL ES 3.0, but was added in OpenGL ES 3.1</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101683">Bug 101683</a> - Some games hang while loading when compositing is shut off or absent</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101703">Bug 101703</a> - No stencil buffer allocated when requested by GLUT</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101704">Bug 101704</a> - [regression][bisected] glReadPixels() from pbuffer failing in Android CTS camera tests</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101766">Bug 101766</a> - Assertion `!&quot;invalid type&quot;' failed when constant expression involves literal of different type</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101774">Bug 101774</a> - gen_clflush.h:37:7: error: implicit declaration of function __builtin_ia32_clflush</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101775">Bug 101775</a> - Xorg segfault since 147d7fb &quot;st/mesa: add a winsys buffers list in st_context&quot;</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101829">Bug 101829</a> - read-after-free in st_framebuffer_validate</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101831">Bug 101831</a> - Build failure in GNOME Continuous</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101851">Bug 101851</a> - [regression] libEGL_common.a undefined reference to '__gxx_personality_v0'</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101867">Bug 101867</a> - Launch options window renders black in Feral Games in current Mesa trunk</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101876">Bug 101876</a> - SIGSEGV when launching Steam</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101910">Bug 101910</a> - [BYT] ES31-CTS.functional.copy_image.non_compressed.viewclass_96_bits.rgb32f_rgb32f</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101925">Bug 101925</a> - playstore/webview crash</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101961">Bug 101961</a> - Serious Sam Fusion hangs system completely</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101982">Bug 101982</a> - Weston crashes when running an OpenGL program on i965</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101983">Bug 101983</a> - [G33] ES2-CTS.functional.shaders.struct.uniform.sampler_nested* regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102024">Bug 102024</a> - FORMAT_FEATURE_SAMPLED_IMAGE_BIT not supported for D16_UNORM and D32_SFLOAT</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102148">Bug 102148</a> - Crash when running qopenglwidget example on mesa llvmpipe win32</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102241">Bug 102241</a> - gallium/wgl: SwapBuffers freezing regularly with swap interval enabled</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102308">Bug 102308</a> - segfault in glCompressedTextureSubImage3D</li>
</ul>
<h2>Changes</h2>
<ul>
<li>GL_APPLE_vertex_array_object support removed.</li>
</ul>
</div>
</body>
</html>

200
docs/relnotes/17.2.1.html Normal file
View File

@@ -0,0 +1,200 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.2.1 Release Notes / September 17, 2017</h1>
<p>
Mesa 17.2.1 is a bug fix release which fixes bugs found since the 17.2.0 release.
</p>
<p>
Mesa 17.2.1 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
c902d8dc2540195bc570d88af1a8fd8a1774373660a27bb1d539551f46824bc1 mesa-17.2.1.tar.gz
77385d17827cff24a3bae134342234f2efe7f7f990e778109682571dbbc9ba1e mesa-17.2.1.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=100613">Bug 100613</a> - Regression in Mesa 17 on s390x (zSystems)</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101709">Bug 101709</a> - [llvmpipe] piglit gl-1.0-scissor-offscreen regression</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102454">Bug 102454</a> - glibc 2.26 doesn't provide anymore xlocale.h</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102467">Bug 102467</a> - src/mesa/state_tracker/st_cb_readpixels.c:178]: (warning) Redundant assignment</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102502">Bug 102502</a> - [bisected] Kodi crashes since commit 707d2e8b - gallium: fold u_trim_pipe_prim call from st/mesa to drivers</li>
</ul>
<h2>Changes</h2>
<p>Bas Nieuwenhuizen (4):</p>
<ul>
<li>radv: Actually set the cmd_buffer usage_flags.</li>
<li>radv: Fix vkCopyImage with both depth and stencil aspects.</li>
<li>radv: Disable multilayer &amp; multilevel DCC.</li>
<li>radv: Don't allocate CMASK for linear images.</li>
</ul>
<p>Ben Crocker (1):</p>
<ul>
<li>llvmpipe: lp_build_gather_elem_vec BE fix for 3x16 load</li>
</ul>
<p>Brian Paul (1):</p>
<ul>
<li>llvmpipe: initialize llvmpipe-&gt;dirty with LP_NEW_SCISSOR</li>
</ul>
<p>Charmaine Lee (1):</p>
<ul>
<li>vbo: fix offset in minmax cache key</li>
</ul>
<p>Dave Airlie (12):</p>
<ul>
<li>radv: disable 1d/2d linear optimisation on gfx9.</li>
<li>radv/gfx9: set descriptor up for base_mip to level range.</li>
<li>Revert "radv: disable support for VEGA for now."</li>
<li>radv/winsys: use amdgpu_bo_va_op_raw.</li>
<li>radv/gfx9: allocate events from uncached VA space</li>
<li>radv: use simpler indirect packet 3 if possible.</li>
<li>radv: don't use iview for meta image width/height.</li>
<li>radv: handle GFX9 1D textures</li>
<li>radv/gfx9: set mip0-depth correctly for 2d arrays/3d images</li>
<li>radv/ac: bump params array for image atomic comp swap</li>
<li>radv/gfx9: fix image resource handling.</li>
<li>radv/winsys: fix flags vs va_flags thinko.</li>
</ul>
<p>Emil Velikov (7):</p>
<ul>
<li>docs: add sha256 checksums for 17.2.0</li>
<li>cherry-ignore: add getCapability patches</li>
<li>cherry-ignore: ignore gfx9 tile swizzle fix</li>
<li>cherry-ignore: add execution_type() fix to the list</li>
<li>cherry-ignore: add EGL+gbm swast patches</li>
<li>egl/x11/dri3: adding missing __DRI_BACKGROUND_CALLABLE extension</li>
<li>Update version to 17.2.1</li>
</ul>
<p>Eric Engestrom (3):</p>
<ul>
<li>util: improve compiler guard</li>
<li>mesa/st: remove unwanted backup file</li>
<li>docs/egl: remove reference to EGL_DRIVERS_PATH</li>
</ul>
<p>Grazvydas Ignotas (1):</p>
<ul>
<li>radv: don't assert on empty hash table</li>
</ul>
<p>Jason Ekstrand (2):</p>
<ul>
<li>anv/formats: Nicely handle unknown VkFormat enums</li>
<li>spirv: Add support for the HelperInvocation builtin</li>
</ul>
<p>Karol Herbst (1):</p>
<ul>
<li>nvc0: write 0 to pipeline_statistics.cs_invocations</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965: Fix crash in fallback GTT mapping.</li>
<li>i965: Set "Subslice Hashing Mode" to 16x16 on Apollolake.</li>
</ul>
<p>Marek Olšák (1):</p>
<ul>
<li>st/mesa: skip draw calls with pipe_draw_info::count == 0</li>
</ul>
<p>Michael Olbrich (1):</p>
<ul>
<li>egl/dri2: only destroy created objects</li>
</ul>
<p>Nicolai Hähnle (1):</p>
<ul>
<li>radeonsi: apply a mask to gl_SampleMaskIn in the PS prolog</li>
</ul>
<p>Nicolai Hähnle (4):</p>
<ul>
<li>radeonsi/gfx9: always flush DB metadata on framebuffer changes</li>
<li>st/glsl_to_tgsi: only the first (inner-most) array reference can be a 2D index</li>
<li>ac/surface: match Z and stencil tile config</li>
<li>glsl: fix glsl_struct_field size calculations for shader cache</li>
</ul>
<p>Ray Strode (1):</p>
<ul>
<li>gallivm: correct channel shift logic on big endian</li>
</ul>
<p>Rob Clark (1):</p>
<ul>
<li>freedreno: skip batch-cache for compute shaders</li>
</ul>
<p>Roland Scheidegger (1):</p>
<ul>
<li>st/mesa: fix view template initialization in try_pbo_readpixels</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radeonsi: update dirty_level_mask before dispatching</li>
</ul>
<p>Timothy Arceri (9):</p>
<ul>
<li>glsl: allow NULL to be passed to encode_type_to_blob()</li>
<li>glsl: stop adding pointers from gl_shader_variable to the cache</li>
<li>glsl: stop adding pointers from glsl_struct_field to the cache</li>
<li>glsl: add has_uniform_storage() helper to shader cache</li>
<li>glsl: don't write uniform storage offset if there isn't one</li>
<li>glsl: always write a name/label string to the cache</li>
<li>compiler: move pointers to the start of shader_info</li>
<li>glsl: stop adding pointers from shader_info to the cache</li>
<li>glsl: stop adding pointers from bindless structs to the cache</li>
</ul>
</div>
</body>
</html>

203
docs/relnotes/17.2.2.html Normal file
View File

@@ -0,0 +1,203 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.2.2 Release Notes / October 2, 2017</h1>
<p>
Mesa 17.2.2 is a bug fix release which fixes bugs found since the 17.2.1 release.
</p>
<p>
Mesa 17.2.2 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
8242256f3243ed3f35184ed7bf0a9070439ccdf477a3bd9cfd2437c0b2f9bc7f mesa-17.2.2.tar.gz
cf522244d6a5a1ecde3fc00e7c96935253fe22f808f064cab98be6f3faa65782 mesa-17.2.2.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102573">Bug 102573</a> - fails to build on armel</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102844">Bug 102844</a> - memory leak with glDeleteProgram for shader program type GL_COMPUTE_SHADER</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102847">Bug 102847</a> - swr fail to build with llvm-5.0.0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102904">Bug 102904</a> - piglit and gl45 cts linker tests regressed</li>
</ul>
<h2>Changes</h2>
<p>Alexandru-Liviu Prodea (1):</p>
<ul>
<li>Scons: Add LLVM 5.0 support</li>
</ul>
<p>Bas Nieuwenhuizen (1):</p>
<ul>
<li>radv: Check for GFX9 for 1D arrays in image_size intrinsic.</li>
</ul>
<p>Boris Brezillon (1):</p>
<ul>
<li>broadcom/vc4: Fix infinite retry in vc4_bo_alloc()</li>
</ul>
<p>Dave Airlie (3):</p>
<ul>
<li>radv/nir: call opt_remove_phis after trivial continues.</li>
<li>ac/surface: handle S8 on gfx9</li>
<li>st/glsl-&gt;tgsi: fix u64 to bool comparisons.</li>
</ul>
<p>David Airlie (1):</p>
<ul>
<li>radv: add gfx9 scissor workaround</li>
</ul>
<p>Emil Velikov (2):</p>
<ul>
<li>docs: add sha256 checksums for 17.2.1</li>
<li>automake: enable libunwind in `make distcheck'</li>
</ul>
<p>Eric Anholt (4):</p>
<ul>
<li>broadcom/vc4: Fix use-after-free for flushing when writing to a texture.</li>
<li>broadcom/vc4: Fix use-after-free trying to mix a quad and tile clear.</li>
<li>broadcom/vc4: Fix use-after-free when deleting a program.</li>
<li>broadcom/vc4: Keep pipe_sampler_view-&gt;texture matching the original texture.</li>
</ul>
<p>Gert Wollny (2):</p>
<ul>
<li>travis: force llvm-3.3 for "make Gallium ST Other"</li>
<li>travis: Add libunwind-dev to gallium/make builds</li>
</ul>
<p>Grazvydas Ignotas (1):</p>
<ul>
<li>configure: check if -latomic is needed for __atomic_*</li>
</ul>
<p>Ian Romanick (1):</p>
<ul>
<li>nv20: Fix GL_CLAMP</li>
</ul>
<p>Jason Ekstrand (6):</p>
<ul>
<li>i965/blorp: Set r8stencil_needs_update when writing stencil</li>
<li>vulkan/wsi/wayland: Stop printing out the DRM device</li>
<li>vulkan/wsi/wayland: Refactor wsi_wl_display code</li>
<li>vulkan/wsi/wayland: Stop caching Wayland displays</li>
<li>vulkan/wsi/wayland: Copy wl_proxy objects from oldSwapchain if available</li>
<li>vulkan/wsi/wayland: Return better error messages</li>
</ul>
<p>Juan A. Suarez Romero (4):</p>
<ul>
<li>cherry-ignore: add "radeonsi/gfx9: proper workaround for LS/HS VGPR initialization bug"</li>
<li>cherry-ignore: add "radv: Check for GFX9 for 1D arrays in image_size intrinsic."</li>
<li>cherry-ignore: add "radv: copy the number of viewports/scissors at pipeline bind time"</li>
<li>Update version to 17.2.2</li>
</ul>
<p>Józef Kucia (1):</p>
<ul>
<li>anv: Fix descriptors copying</li>
</ul>
<p>Kenneth Graunke (2):</p>
<ul>
<li>i965/vec4: Actually handle atomic op intrinsics.</li>
<li>i965/vec4: Fix swizzles on atomic sources.</li>
</ul>
<p>Leo Liu (1):</p>
<ul>
<li>st/va/postproc: use video original size for postprocessing</li>
</ul>
<p>Lucas Stach (1):</p>
<ul>
<li>etnaviv: fix 16bpp clears</li>
</ul>
<p>Matt Turner (2):</p>
<ul>
<li>util: Link libmesautil into u_atomic_test</li>
<li>util/u_atomic: Add implementation of __sync_val_compare_and_swap_8</li>
</ul>
<p>Nicolai Hähnle (9):</p>
<ul>
<li>radeonsi: workaround for gather4 on integer cube maps</li>
<li>amd/common: round cube array slice in ac_prepare_cube_coords</li>
<li>amd/common: add workaround for cube map array layer clamping</li>
<li>glsl/linker: fix output variable overlap check</li>
<li>radeonsi: fix array textures layer coordinate</li>
<li>radeonsi: set MIP_POINT_PRECLAMP to 0</li>
<li>amd/addrlib: fix missing va_end() after va_copy()</li>
<li>amd/common: move ac_build_phi from radeonsi</li>
<li>radeonsi: fix a regression in integer cube map handling</li>
</ul>
<p>Samuel Iglesias Gonsálvez (1):</p>
<ul>
<li>anv: fix viewport transformation for z component</li>
</ul>
<p>Samuel Pitoiset (1):</p>
<ul>
<li>radv: fix saved compute state when doing statistics/occlusion queries</li>
</ul>
<p>Tapani Pälli (1):</p>
<ul>
<li>mesa: free current ComputeProgram state in _mesa_free_context_data</li>
</ul>
<p>Tim Rowley (1):</p>
<ul>
<li>swr/rast: remove llvm fence/atomics from generated files</li>
</ul>
<p>Tomasz Figa (1):</p>
<ul>
<li>egl/dri2: Implement swapInterval fallback in a conformant way</li>
</ul>
</div>
</body>
</html>

181
docs/relnotes/17.2.3.html Normal file
View File

@@ -0,0 +1,181 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.2.3 Release Notes / October 19, 2017</h1>
<p>
Mesa 17.2.3 is a bug fix release which fixes bugs found since the 17.2.2 release.
</p>
<p>
Mesa 17.2.3 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
fb305eecfeec1fd771fdc96fff973c51871f7bd35fd2bd56cacc27b4b8823220 mesa-17.2.3.tar.gz
a0b0ec8f7b24dd044d7ab30a8c7e6d3767521e245f88d4ed5dd93315dc56f837 mesa-17.2.3.tar.xz
</pre>
<h2>New features</h2>
<p>None</p>
<h2>Bug fixes</h2>
<ul>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=101832">Bug 101832</a> - [PATCH][regression][bisect] Xorg fails to start after f50aa21456d82c8cb6fbaa565835f1acc1720a5d</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102852">Bug 102852</a> - Scons: Support the new Scons 3.0.0</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=102940">Bug 102940</a> - Regression: Vulkan KMS rendering crashes since 17.2</li>
</ul>
<h2>Changes</h2>
<p>Alex Smith (1):</p>
<ul>
<li>radv: Add R16G16B16A16_SNORM fast clear support</li>
</ul>
<p>Bas Nieuwenhuizen (2):</p>
<ul>
<li>nir/spirv: Allow loop breaks in a switch body.</li>
<li>radv: Only set the MTYPE flags on GFX9+.</li>
</ul>
<p>Ben Crocker (4):</p>
<ul>
<li>gallivm: fix typo in debug_printf message</li>
<li>gallivm: allow additional llc options</li>
<li>gallivm/ppc64le: adjust VSX code generation control.</li>
<li>gallivm/ppc64le: allow environmental control of Altivec code generation</li>
</ul>
<p>Daniel Stone (2):</p>
<ul>
<li>egl/wayland: Check queryImage return for wl_buffer</li>
<li>egl/wayland: Don't use dmabuf with no modifiers</li>
</ul>
<p>Dave Airlie (2):</p>
<ul>
<li>radv: emit fmuladd instead of fma to llvm.</li>
<li>radv: lower ffma in nir.</li>
</ul>
<p>Emil Velikov (6):</p>
<ul>
<li>cherry-ignore: add "anv: Remove unreachable cases from isl_format_for_size"</li>
<li>cherry-ignore: add "anv/wsi: Allocate enough memory for the entire image"</li>
<li>swr/rast: do not crash on NULL strings returned by getenv</li>
<li>wayland-drm: use a copy of the wayland_drm_callbacks struct</li>
<li>eglmesaext: add forward declaration for struct wl_buffers</li>
<li>Update version to 17.2.3</li>
</ul>
<p>Eric Engestrom (1):</p>
<ul>
<li>scons: use python3-compatible print()</li>
</ul>
<p>Ilia Mirkin (2):</p>
<ul>
<li>nv50/ir: fix 64-bit integer shifts</li>
<li>nv50,nvc0: fix push hint logic in presence of a start offset</li>
</ul>
<p>Jason Ekstrand (6):</p>
<ul>
<li>intel/compiler: Don't cmod propagate into a saturated operation</li>
<li>intel/compiler: Don't propagate cmod into integer multiplies</li>
<li>glsl/blob: Return false from ensure_can_read on overrun</li>
<li>glsl/blob: Return false from grow_to_fit if we've ever failed</li>
<li>nir/opcodes: Fix constant-folding of ufind_msb</li>
<li>nir: Get rid of the variable on vote intrinsics</li>
</ul>
<p>Juan A. Suarez Romero (1):</p>
<ul>
<li>docs: add sha256 checksums for 17.2.2</li>
</ul>
<p>Józef Kucia (3):</p>
<ul>
<li>anv: Fix vkCmdFillBuffer()</li>
<li>spirv: Fix SpvOpAtomicISub</li>
<li>anv: Do not assert() on VK_ATTACHMENT_UNUSED</li>
</ul>
<p>Leo Liu (3):</p>
<ul>
<li>st/va: use pipe transfer_map to map upload buffer</li>
<li>st/vdpau: don't re-allocate interlaced buffer with packed YUV format</li>
<li>st/va: don't re-allocate interlaced buffer with pakced format</li>
</ul>
<p>Lionel Landwerlin (4):</p>
<ul>
<li>intel: compiler: vec4: add missing default 0 lod</li>
<li>anv/cmd_buffer: fix push descriptors with set &gt; 0</li>
<li>anv/cmd_buffer: Reset state in cmd_buffer_destroy</li>
<li>anv: bo_cache: allow importing a BO larger than needed</li>
</ul>
<p>Marek Olšák (3):</p>
<ul>
<li>mesa: fix texture updates for ATI_fragment_shader</li>
<li>st/mesa: don't use pipe_surface for passing information about EGLImage</li>
<li>glsl_to_tgsi: fix instruction order for bindless textures</li>
</ul>
<p>Nicolai Hähnle (14):</p>
<ul>
<li>st/glsl_to_tgsi: fix conditional assignments to packed shader outputs</li>
<li>amd/common: fix build_cube_select</li>
<li>radeonsi/gfx9: fix geometry shaders without output vertices</li>
<li>util/queue: fix a race condition in the fence code</li>
<li>glsl/lower_instruction: handle denorms and overflow in ldexp correctly</li>
<li>radeonsi: move current_rast_prim to r600_common_context</li>
<li>radeonsi: don't discard points and lines</li>
<li>radeonsi: deduce rast_prim correctly for tessellation point mode</li>
<li>radeonsi: fix maximum advertised point size / line width</li>
<li>st/mesa: don't clobber glGetInternalformat* buffer for GL_NUM_SAMPLE_COUNTS</li>
<li>st/glsl_to_tgsi: fix indirect access to 64-bit integer</li>
<li>st/glsl_to_tgsi: fix a use-after-free in merge_two_dsts</li>
<li>radeonsi: clamp depth comparison value only for fixed point formats</li>
<li>radeonsi: clamp border colors for upgraded depth textures</li>
</ul>
<p>Rob Clark (2):</p>
<ul>
<li>freedreno/a5xx: align height to GMEM</li>
<li>freedreno/a5xx: fix missing restore state</li>
</ul>
</div>
</body>
</html>

docs/relnotes/17.3.0.html

@@ -0,0 +1,72 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Mesa Release Notes</title>
<link rel="stylesheet" type="text/css" href="../mesa.css">
</head>
<body>
<div class="header">
<h1>The Mesa 3D Graphics Library</h1>
</div>
<iframe src="../contents.html"></iframe>
<div class="content">
<h1>Mesa 17.3.0 Release Notes / TBD</h1>
<p>
Mesa 17.3.0 is a new development release.
People who are concerned with stability and reliability should stick
with a previous release or wait for Mesa 17.3.1.
</p>
<p>
Mesa 17.3.0 implements the OpenGL 4.5 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.5. OpenGL
4.5 is <strong>only</strong> available if requested at context creation
because compatibility contexts are not supported.
</p>
<h2>SHA256 checksums</h2>
<pre>
TBD.
</pre>
<h2>New features</h2>
<p>
Note: some of the new features are only available with certain drivers.
</p>
<ul>
<li>libtxc_dxtn is now integrated into Mesa. GL_EXT_texture_compression_s3tc and GL_ANGLE_texture_compression_dxt are now always enabled on drivers that support them</li>
<li>GL_ARB_indirect_parameters on i965/gen7+</li>
<li>GL_ARB_polygon_offset_clamp on i965, nv50, nvc0, r600, radeonsi, llvmpipe, swr</li>
<li>GL_ARB_transform_feedback_overflow_query on radeonsi</li>
<li>GL_ARB_texture_filter_anisotropic on i965, nv50, nvc0, r600, radeonsi</li>
<li>GL_EXT_memory_object on radeonsi</li>
<li>GL_EXT_memory_object_fd on radeonsi</li>
<li>EGL_ANDROID_native_fence_sync on radeonsi with a future kernel (possibly 4.15)</li>
<li>EGL_IMG_context_priority on i965</li>
</ul>
<h2>Bug fixes</h2>
<ul>
TBD
</ul>
<h2>Changes</h2>
<ul>
TBD
</ul>
</div>
</body>
</html>


@@ -57,7 +57,7 @@ copy texturing).
<li>New Intel i965 DRI driver
<li>New <code>minstall</code> script to replace normal install program
<li>Faster fragment program execution in software
<li>Added (or fixed) support for <a href="http://www.opengl.org/registry/specs/SGI/make_current_read.txt">
<li>Added (or fixed) support for <a href="https://www.khronos.org/registry/OpenGL/extensions/SGI/GLX_SGI_make_current_read.txt">
GLX_SGI_make_current_read</a> to the following drivers:
<ul>
<li>radeon</li>


@@ -226,7 +226,7 @@ did not exist in the 7.10 release series at all.</p>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=36086">Bug 36086</a> - [wine] Segfault r300_resource_copy_region with some wine apps and RADEON_HYPERZ</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=36182">Bug 36182</a> - Game Trine from http://www.humblebundle.com/ needs ATI_draw_buffers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=36182">Bug 36182</a> - Game Trine from https://www.humblebundle.com/ needs ATI_draw_buffers</li>
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=36268">Bug 36268</a> - [r300g, bisected] minor flickering in Unigine Sanctuary</li>


@@ -21,7 +21,7 @@ Mesa 7.5.1 is a bug-fix release fixing issues found since the 7.5 release.
</p>
<p>
The main new feature of Mesa 7.5.x is the
<a href="http://wiki.freedesktop.org/wiki/Software/gallium">Gallium3D</a> infrastructure.
<a href="https://www.freedesktop.org/wiki/Software/gallium">Gallium3D</a> infrastructure.
</p>
<p>
Mesa 7.5.1 implements the OpenGL 2.1 API, but the version reported by


@@ -21,7 +21,7 @@ Mesa 7.5.2 is a bug-fix release fixing issues found since the 7.5.1 release.
</p>
<p>
The main new feature of Mesa 7.5.x is the
<a href="http://wiki.freedesktop.org/wiki/Software/gallium">Gallium3D</a> infrastructure.
<a href="https://www.freedesktop.org/wiki/Software/gallium">Gallium3D</a> infrastructure.
</p>
<p>
Mesa 7.5.2 implements the OpenGL 2.1 API, but the version reported by


@@ -23,7 +23,7 @@ with the 7.4.x branch or wait for Mesa 7.5.1.
</p>
<p>
The main new feature of Mesa 7.5 is the
<a href="http://wiki.freedesktop.org/wiki/Software/gallium">Gallium3D</a> infrastructure.
<a href="https://www.freedesktop.org/wiki/Software/gallium">Gallium3D</a> infrastructure.
</p>
<p>
Mesa 7.5 implements the OpenGL 2.1 API, but the version reported by


@@ -90,7 +90,7 @@ The two supported build methods are now autoconf/automake and SCons.
<li>Removed support for GL_ARB_shadow_ambient extension</li>
<li>Removed Gallium3D - nvfx driver (use nv30 instead)</li>
<li>
libGLU has been moved into its own repository, found at <a href="http://cgit.freedesktop.org/mesa/glu/">http://cgit.freedesktop.org/mesa/glu/</a>
libGLU has been moved into its own repository, found at <a href="https://cgit.freedesktop.org/mesa/glu/">https://cgit.freedesktop.org/mesa/glu/</a>
</li>
</ul>


@@ -68,9 +68,9 @@ b1ae5a4d9255953980bc9254f5323420 MesaLib-9.1.2.zip
<li><a href="https://bugs.freedesktop.org/show_bug.cgi?id=62434">Bug 62434</a> - [bisected] 3284.073] (EE) AIGLX error: dlopen of /usr/lib/xorg/modules/dri/r600_dri.so failed (/usr/lib/libllvmradeon9.2.0.so: undefined symbol: lp_build_tgsi_intrinsic)</li>
<li><a href="http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=349437">Debian bug #349437</a> - mesa - FTBFS: error: 'IEEE_ONE' undeclared</li>
<li><a href="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=349437">Debian bug #349437</a> - mesa - FTBFS: error: 'IEEE_ONE' undeclared</li>
<li><a href="http://bugzilla.redhat.com/show_bug.cgi?id=918661">Redhat bug #918661</a> - crash in routine Avogadro UI manipulation</li>
<li><a href="https://bugzilla.redhat.com/show_bug.cgi?id=918661">Redhat bug #918661</a> - crash in routine Avogadro UI manipulation</li>
</ul>


@@ -17,13 +17,13 @@
<h1>Code Repository</h1>
<p>
Mesa uses <a href="http://git-scm.com">git</a>
Mesa uses <a href="https://git-scm.com">git</a>
as its source code management system.
</p>
<p>
The master git repository is hosted on
<a href="http://www.freedesktop.org">freedesktop.org</a>.
<a href="https://www.freedesktop.org">freedesktop.org</a>.
</p>
<p>
@@ -35,9 +35,9 @@ You may access the repository either as an
<p>
You may also
<a href="http://cgit.freedesktop.org/mesa/mesa/"
<a href="https://cgit.freedesktop.org/mesa/mesa/"
>browse the main Mesa git repository</a> and the
<a href="http://cgit.freedesktop.org/mesa/demos"
<a href="https://cgit.freedesktop.org/mesa/demos"
>Mesa demos and tests git repository</a>.
</p>
@@ -73,7 +73,7 @@ follow this procedure:
</p>
<ol>
<li>Subscribe to the
<a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev">mesa-dev</a>
<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev">mesa-dev</a>
mailing list.
<li>Start contributing to the project by
<a href="submittingpatches.html" target="_parent">submitting patches</a> to
@@ -92,7 +92,7 @@ only if they're being supervised by another Mesa developer at the same
organization and planning to work in a limited area of the code or on a
separate branch.
<li>To apply for an account, follow
<a href="http://www.freedesktop.org/wiki/AccountRequests">these directions</a>.
<a href="https://www.freedesktop.org/wiki/AccountRequests">these directions</a>.
It's also appreciated if you briefly describe what you intend to do (work
on a particular driver, add a new extension, etc.) in the bugzilla record.
</ol>
@@ -121,7 +121,7 @@ Once your account is established:
<h2>Windows Users</h2>
<p>
If you're <a href="http://git.wiki.kernel.org/index.php/WindowsInstall">
If you're <a href="https://git.wiki.kernel.org/index.php/WindowsInstall">
using git on Windows</a> you'll want to enable automatic CR/LF conversion in
your local copy of the repository:
</p>
@@ -144,7 +144,7 @@ Unix users don't need to set this option.
<p>
At any given time, there may be several active branches in Mesa's
repository.
Generally, the trunk contains the latest development (unstable)
Generally, <tt>master</tt> contains the latest development (unstable)
code while a branch has the latest stable code.
</p>
@@ -235,7 +235,7 @@ If you want the rebase action to be the default action, then
git config --global branch.autosetuprebase=always
</pre>
<p>
See <a href="http://www.eecs.harvard.edu/~cduan/technical/git/">Understanding Git Conceptually</a> for a fairly clear explanation about all of this.
See <a href="https://www.eecs.harvard.edu/~cduan/technical/git/">Understanding Git Conceptually</a> for a fairly clear explanation about all of this.
</p>
</ol>


@@ -18,7 +18,7 @@
<p>
This page describes the features and status of Mesa's support for the
<a href="http://opengl.org/documentation/glsl/">
<a href="https://opengl.org/documentation/glsl/">
OpenGL Shading Language</a>.
</p>
@@ -49,8 +49,9 @@ execution. These are generally used for debugging.
<li><b>log</b> - log all GLSL shaders to files.
The filenames will be "shader_X.vert" or "shader_X.frag" where X is
the shader ID.
<li><b>nopt</b> - disable compiler optimizations
<li><b>opt</b> - force compiler optimizations
<li><b>cache_info</b> - print debug information about shader cache
<li><b>cache_fb</b> - force cached shaders to be ignored and do a full
recompile via the fallback path</li>
<li><b>uniform</b> - print message to stdout when glUniform is called
<li><b>nopvert</b> - force vertex shaders to be a simple shader that just transforms
the vertex position with ftransform() and passes through the color and
@@ -63,9 +64,9 @@ execution. These are generally used for debugging.
Example: export MESA_GLSL=dump,nopt
</p>
<h3 id="replacement">Experimenting with Shader Replacements</h3>
<p>
Shaders can be dumped and replaced on runtime for debugging purposes. Mesa
needs to be configured with '--with-sha1' to enable this functionality. This
Shaders can be dumped and replaced at runtime for debugging purposes. This
feature is not currently supported by the SCons build.
This is controlled via the following environment variables:
@@ -75,7 +76,22 @@ This is controlled via following environment variables:
</ul>
Note that each path must exist before the program is run for dumping or replacing to work.
When both are set, these paths should be different so the dumped shaders do
not clobber the replacement shaders.
not clobber the replacement shaders. Also, the filenames of the replacement shaders
should match the filenames of the corresponding dumped shaders.
</p>
<h3 id="capture">Capturing Shaders</h3>
<p>
Setting <b>MESA_SHADER_CAPTURE_PATH</b> to a directory will cause the compiler
to write <tt>.shader_test</tt> files for use with
<a href="https://cgit.freedesktop.org/mesa/shader-db">shader-db</a>, a tool
which compiler developers can use to gather statistics about shaders
(instructions, cycles, memory accesses, and so on).
</p>
<p>
Notably, this captures linked GLSL shaders - with all stages together -
as well as ARB programs.
</p>
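<p>
A minimal Python sketch of launching a program with shader capture enabled.
Only <b>MESA_SHADER_CAPTURE_PATH</b> comes from the text above; the helper
names <code>capture_env</code> and <code>run_with_capture</code> are
hypothetical, and creating the directory beforehand is a precaution assumed
here rather than a documented requirement.
</p>

```python
import os
import subprocess


def capture_env(capture_dir, base_env=None):
    """Return a copy of the environment with MESA_SHADER_CAPTURE_PATH set.

    The directory is created first (an assumption of this sketch) so the
    compiler has somewhere to write its .shader_test files.
    """
    os.makedirs(capture_dir, exist_ok=True)
    env = dict(os.environ if base_env is None else base_env)
    env["MESA_SHADER_CAPTURE_PATH"] = os.path.abspath(capture_dir)
    return env


def run_with_capture(argv, capture_dir):
    """Run the program given by `argv` with capture enabled; return its exit code."""
    return subprocess.run(argv, env=capture_env(capture_dir), check=False).returncode
```

<p>
The captured files in the chosen directory can then be handed to shader-db.
</p>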
<h2 id="support">GLSL Version</h2>
@@ -221,7 +237,7 @@ regressions.
</p>
<p>
The <a href="http://piglit.freedesktop.org/">Piglit</a> project
The <a href="https://piglit.freedesktop.org/">Piglit</a> project
has many GLSL tests.
</p>


@@ -31,7 +31,7 @@ the <code>doxygen</code> directory and run <code>make</code>.
<p>
For an example of Doxygen usage in Mesa, see a recent source file
such as <a href="http://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/main/bufferobj.c">bufferobj.c</a>.
such as <a href="https://cgit.freedesktop.org/mesa/mesa/tree/src/mesa/main/bufferobj.c">bufferobj.c</a>.
</p>
@@ -41,6 +41,11 @@ run the doxygen scripts, you can read the documentation
<a href="../doxygen/main/index.html">here</a>
</p>
<p>
Gallium is also documented using Sphinx. The generated output can be found
<a href="https://gallium.readthedocs.io">on Gallium.ReadTheDocs.io</a>.
</p>
</div>
</body>
</html>


@@ -31,7 +31,7 @@ each directory.
<ul>
<li><b>glsl</b> - the GLSL IR and compiler
<li><b>nir</b> - the NIR IR and compiler
<li><b>spriv</b> - the SPIR-V compiler
<li><b>spirv</b> - the SPIR-V compiler
</ul>
<li><b>egl</b> - EGL library sources
<ul>
@@ -145,7 +145,7 @@ each directory.
<li><b>xvmc</b> - XvMC state tracker
<li><b>vdpau</b> - VDPAU state tracker
<li><b>va</b> - VA-API state tracker
<li><b>omx</b> - OpenMAX state tracker
<li><b>omx_bellagio</b> - OpenMAX Bellagio state tracker
</ul>
<li><b>winsys</b> -
<ul>


@@ -0,0 +1,98 @@
Name
MESA_drm_image_formats
Name Strings
EGL_MESA_drm_image_formats
Contributors
Nicolai Hähnle <Nicolai.Haehnle@amd.com>
Qiang Yu <Qiang.Yu@amd.com>
Contact
Nicolai Hähnle <Nicolai.Haehnle@amd.com>
Status
Proposal
Version
Version 1, January 26, 2017
Number
EGL Extension #??
Dependencies
This extension requires the EGL_MESA_drm_image extension.
This extension is written against the wording of EGL_MESA_drm_image
specification.
Overview
This extension extends the functionality of EGL_MESA_drm_image by adding
additional formats required by Glamor for use with DRM buffers.
IP Status
Open-source; freely implementable.
New Procedures and Functions
None
New Tokens
Accepted as values for the EGL_IMAGE_FORMAT_MESA attribute:
EGL_DRM_BUFFER_FORMAT_ARGB2101010_MESA 0x3290
EGL_DRM_BUFFER_FORMAT_ARGB1555_MESA 0x3291
EGL_DRM_BUFFER_FORMAT_RGB565_MESA 0x3292
Additions to the EGL_MESA_drm_image Specification:
Remove the sentence "The only format specified ..." from the paragraph
describing eglCreateDRMImageMESA and add the following paragraph:
The formats specified for use with EGL_DRM_BUFFER_FORMAT_MESA are:
* EGL_DRM_BUFFER_FORMAT_ARGB32_MESA, where each pixel is a CPU-endian
32-bit quantity, with alpha in the upper 8 bits, then red, then green,
then blue,
* EGL_DRM_BUFFER_FORMAT_ARGB2101010_MESA, where each pixel is a CPU-
endian, 32-bit quantity, with alpha in the most significant 2 bits,
followed by 10 bits each for red, green, and blue,
* EGL_DRM_BUFFER_FORMAT_ARGB1555_MESA, where each pixel is a CPU-endian
16-bit quantity, with alpha in the most significant bit, followed by
5 bits each for red, green, and blue, and
* EGL_DRM_BUFFER_FORMAT_RGB565_MESA, where each pixel is a CPU-endian
16-bit quantity, with red in the 5 most significant bits, followed by
6 bits of green and 5 bits of blue.
Issues
1. Should we expose the full set of channel permutations for the formats,
e.g. ABGR2101010, RGBA1010102, and BGRA1010102 in addition to
ARGB2101010?
RESOLVED: No.
DISCUSSION: The original extension sets a precedent of only exposing one
of the possible permutations of 8-bit channel formats. It is also not
clear where the additional permutations would be used. For example,
Glamor has a fixed mapping from pixmap/screen depth to format that
doesn't allow for the other permutations.
Revision History
Version 1, January, 2017
Initial draft (Nicolai Hähnle)
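
The bit layouts described for the three new formats can be sketched in
Python as follows. This is an illustration only, not part of the extension:
the function names are hypothetical, and channel values are assumed to be
unsigned integers that already fit their channel width.

```python
import struct


def _check(name, value, bits):
    """Reject channel values that do not fit in the given number of bits."""
    if not 0 <= value < (1 << bits):
        raise ValueError("%s=%r does not fit in %d bits" % (name, value, bits))


def pack_argb2101010(a, r, g, b):
    """32-bit pixel: alpha in the top 2 bits, then 10 bits each of R, G, B."""
    _check("a", a, 2)
    for name, value in (("r", r), ("g", g), ("b", b)):
        _check(name, value, 10)
    return (a << 30) | (r << 20) | (g << 10) | b


def pack_argb1555(a, r, g, b):
    """16-bit pixel: alpha in the top bit, then 5 bits each of R, G, B."""
    _check("a", a, 1)
    for name, value in (("r", r), ("g", g), ("b", b)):
        _check(name, value, 5)
    return (a << 15) | (r << 10) | (g << 5) | b


def pack_rgb565(r, g, b):
    """16-bit pixel: 5 bits of red, then 6 bits of green, then 5 bits of blue."""
    _check("r", r, 5)
    _check("g", g, 6)
    _check("b", b, 5)
    return (r << 11) | (g << 5) | b


def pixel_bytes(value, bits):
    """Serialize a packed pixel in CPU (native) byte order, as the spec words it."""
    return struct.pack("=I" if bits == 32 else "=H", value)
```

With these definitions, a fully opaque pure red pixel is 0xFFF00000 in
ARGB2101010 and 0xF800 in RGB565.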


@@ -20,11 +20,11 @@ Status
Version
Version 2, July 7, 2016
Version 3, March 31, 2017
Number
TBD
OpenGL Extension #495
Dependencies
@@ -34,7 +34,7 @@ Dependencies
This extension is written against Version 1.50 (Revision 09) of the OpenGL
Shading Language Specification.
GLSL 1.30 is required.
GLSL 1.30 (OpenGL) or GLSL ES 3.00 (OpenGL ES) is required.
This extension interacts with ARB_gpu_shader5.
@@ -51,9 +51,10 @@ Overview
calculations).
This extension provides a set of new features to the OpenGL Shading
Language to support capabilities of these GPUs, extending the capabilities
of version 1.30 of the OpenGL Shading Language. Shaders
using the new functionality provided by this extension should enable this
Language to support capabilities of these GPUs, extending the
capabilities of version 1.30 of the OpenGL Shading Language and version
3.00 of the OpenGL ES Shading Language. Shaders using the new
functionality provided by this extension should enable this
functionality via the construct
#extension GL_MESA_shader_integer_functions : require (or enable)
@@ -516,5 +517,6 @@ Revision History
Rev. Date Author Changes
---- ----------- -------- -----------------------------------------
3 31-Mar-2017 Jon Leech Add ES support (OpenGL-Registry/issues/3)
2 7-Jul-2016 idr Fix typo in #extension line
1 20-Jun-2016 idr Initial version based on GL_ARB_gpu_shader5.


@@ -76,9 +76,9 @@ Overview
References:
http://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=3;t=011557
http://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=3;t=000516
http://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=3;t=011903
https://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=3;t=011557
https://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=3;t=000516
https://www.opengl.org/discussion_boards/cgi_directory/ultimatebb.cgi?ubb=get_topic;f=3;t=011903
http://www.delphi3d.net/articles/viewarticle.php?article=terraintex.htm
New Procedures and Functions


@@ -133,7 +133,7 @@ New Tokens
GetFloatv and GetIntegerv:
FRAGMENT_PROGRAM_POSITION_MESA 0x8bb0
VERTEX_PROGRAM_POSITION_MESA 0x8bb4
VERTEX_PROGRAM_POSITION_MESA 0x8bb5
Accepted by the <pname> parameter of GetPointerv:


@@ -1,10 +1,10 @@
The definitive source for enum values and reserved ranges are the XML files in
the Khronos registry:
https://cvs.khronos.org/svn/repos/ogl/trunk/doc/registry/public/api/egl.xml
https://cvs.khronos.org/svn/repos/ogl/trunk/doc/registry/public/api/gl.xml
https://cvs.khronos.org/svn/repos/ogl/trunk/doc/registry/public/api/glx.xml
https://cvs.khronos.org/svn/repos/ogl/trunk/doc/registry/public/api/wgl.xml
https://github.com/KhronosGroup/EGL-Registry/blob/master/api/egl.xml
https://github.com/KhronosGroup/OpenGL-Registry/blob/master/xml/gl.xml
https://github.com/KhronosGroup/OpenGL-Registry/blob/master/xml/glx.xml
https://github.com/KhronosGroup/OpenGL-Registry/blob/master/xml/wgl.xml
GL blocks allocated to Mesa:
0x8750-0x875F
@@ -63,6 +63,21 @@ GL_MESAX_texture_stack:
GL_TEXTURE_1D_STACK_BINDING_MESAX 0x875D
GL_TEXTURE_2D_STACK_BINDING_MESAX 0x875E
GL_MESA_program_debug
GL_FRAGMENT_PROGRAM_POSITION_MESA 0x8BB0
GL_FRAGMENT_PROGRAM_CALLBACK_MESA 0x8BB1
GL_FRAGMENT_PROGRAM_CALLBACK_FUNC_MESA 0x8BB2
GL_FRAGMENT_PROGRAM_CALLBACK_DATA_MESA 0x8BB3
GL_FRAGMENT_PROGRAM_POSITION_MESA 0x8BB4
GL_FRAGMENT_PROGRAM_CALLBACK_MESA 0x8BB5
GL_FRAGMENT_PROGRAM_CALLBACK_FUNC_MESA 0x8BB6
GL_FRAGMENT_PROGRAM_CALLBACK_DATA_MESA 0x8BB7
GL_MESA_tile_raster_order
GL_TILE_RASTER_ORDER_FIXED_MESA 0x8BB8
GL_TILE_RASTER_ORDER_INCREASING_X_MESA 0x8BB9
GL_TILE_RASTER_ORDER_INCREASING_Y_MESA 0x8BBA
EGL_MESA_drm_image
EGL_DRM_BUFFER_FORMAT_MESA 0x31D0
EGL_DRM_BUFFER_USE_MESA 0x31D1
@@ -76,6 +91,11 @@ EGL_MESA_platform_gbm
EGL_MESA_platform_surfaceless
EGL_PLATFORM_SURFACELESS_MESA 0x31DD
EGL_MESA_drm_image
EGL_DRM_BUFFER_FORMAT_ARGB2101010_MESA 0x3290
EGL_DRM_BUFFER_FORMAT_ARGB1555_MESA 0x3291
EGL_DRM_BUFFER_FORMAT_RGB565_MESA 0x3292
EGL_WL_bind_wayland_display
EGL_TEXTURE_FORMAT 0x3080
EGL_WAYLAND_BUFFER_WL 0x31D5


@@ -25,6 +25,7 @@
<li><a href="#reviewing">Reviewing Patches</a>
<li><a href="#nominations">Nominating a commit for a stable branch</a>
<li><a href="#criteria">Criteria for accepting patches to the stable branch</a>
<li><a href="#backports">Sending backports for the stable branch</a>
<li><a href="#gittips">Git tips</a>
</ul>
@@ -72,11 +73,16 @@ if needed. For example:
platform.
</pre>
<li>A "Signed-off-by:" line is not required, but not discouraged either.
<li>If a patch address a bugzilla issue, that should be noted in the
<li>If a patch addresses a bugzilla issue, that should be noted in the
patch comment. For example:
<pre>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89689
</pre>
<li>If a patch addresses an issue introduced by an earlier commit, that should be
noted in the patch comment. For example:
<pre>
Fixes: d7b3707c612 "util/disk_cache: use stat() to check if entry is a directory"
</pre>
<li>If there have been several revisions to a patch during the review
process, they should be noted such as in this example:
<pre>
@@ -114,7 +120,7 @@ them in the CC list.
Please use common sense and do <strong>not</strong> blindly add everyone.
<br>
<pre>
$ scripts/get_reviewer.pl --help # to get the the help screen
$ scripts/get_reviewer.pl --help # to get the help screen
$ scripts/get_reviewer.pl -f src/egl/drivers/dri2/platform_android.c
Rob Herring <robh@kernel.org> (reviewer:ANDROID EGL SUPPORT,added_lines:188/700=27%,removed_lines:58/283=20%)
Tomasz Figa <tfiga@chromium.org> (reviewer:ANDROID EGL SUPPORT,authored:12/41=29%,added_lines:308/700=44%,removed_lines:115/283=41%)
@@ -140,11 +146,23 @@ to update the tests themselves.
<p>
Whenever possible and applicable, test the patch with
<a href="http://piglit.freedesktop.org">Piglit</a> and/or
<a href="https://piglit.freedesktop.org">Piglit</a> and/or
<a href="https://android.googlesource.com/platform/external/deqp/">dEQP</a>
to check for regressions.
</p>
<p>
As mentioned at the beginning, patches should be bisectable.
A good way to test this is to use the <code>git rebase</code> command
to run your tests on each commit. Assuming your branch is based off
<code>origin/master</code>, you can run:
<pre>
$ git rebase --interactive --exec "make check" origin/master
</pre>
replacing <code>"make check"</code> with whatever other test you want to
run.
</p>
<h2 id="mailing">Mailing Patches</h2>
@@ -173,6 +191,16 @@ When submitting follow-up patches you should also login to
state of your old patches to Superseded.
</p>
<p>
Some companies' mail servers automatically append a legal disclaimer,
usually containing something along the lines of "The information in this
email is confidential" and "distribution is strictly prohibited".<br/>
These legal notices prevent us from being able to accept your patch,
rendering the whole process pointless. Please make sure these are
disabled before sending your patches. (Note that you may need to contact
your email administrator for this.)
</p>
<h2 id="reviewing">Reviewing Patches</h2>
<p>
@@ -205,7 +233,7 @@ as the issues are resolved first.
<h2 id="nominations">Nominating a commit for a stable branch</h2>
<p>
There are three ways to nominate patch for inclusion of the stable branch and
There are three ways to nominate a patch for inclusion in the stable branch and
release.
</p>
<ul>
@@ -232,22 +260,16 @@ Here are some examples of such a note:
</p>
<ul>
<li>CC: &lt;mesa-stable@lists.freedesktop.org&gt;</li>
<li>CC: "9.2 10.0" &lt;mesa-stable@lists.freedesktop.org&gt;</li>
<li>CC: "10.0" &lt;mesa-stable@lists.freedesktop.org&gt;</li>
</ul>
Simply adding the CC to the mesa-stable list address is adequate to nominate
the commit for the most-recently-created stable branch. It is only necessary
to specify a specific branch name, (such as "9.2 10.0" or "10.0" in the
examples above), if you want to nominate the commit for an older stable
branch. And, as in these examples, you can nominate the commit for the older
branch in addition to the more recent branch, or nominate the commit
exclusively for the older branch.
the commit for all the active stable branches. If the commit does not apply
to one of those branches, the stable-release manager will reply stating so.
This "CC" syntax for patch nomination will cause patches to automatically be
copied to the mesa-stable@ mailing list when you use "git send-email" to send
patches to the mesa-dev@ mailing list. If you prefer using --suppress-cc that
won't have any effect negative effect on the patch nomination.
won't have any negative effect on the patch nomination.
<p>
Note: by removing the tag [as the commit is pushed] the patch is
@@ -256,18 +278,60 @@ Note: by removing the tag [as the commit is pushed] the patch is
Thus, drop the line <strong>only</strong> if you want to cancel the nomination.
</p>
Alternatively, if one uses the "Fixes" tag as described in the "Patch formatting"
section, it nominates a commit for all active stable branches that include the
commit that is referred to.
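<p>
The two nomination tags described above can be detected mechanically. The
following Python sketch is an illustration under stated assumptions, not the
script the release managers use: the function name
<code>stable_nomination</code> and both regular expressions are hypothetical.
</p>

```python
import re

# "Fixes: d7b3707c612 ..." with an abbreviated or full commit ID.
FIXES_RE = re.compile(r"^Fixes:\s*([0-9a-f]{7,40})\b", re.MULTILINE)
# A CC line that mentions the mesa-stable list address.
STABLE_CC_RE = re.compile(
    r"^CC:.*mesa-stable@lists\.freedesktop\.org",
    re.MULTILINE | re.IGNORECASE,
)


def stable_nomination(message):
    """Return (nominated, fixed_commits) for a commit message.

    A commit counts as nominated if it CCs the mesa-stable list or carries
    at least one Fixes tag; fixed_commits lists the referenced commit IDs.
    """
    fixes = FIXES_RE.findall(message)
    has_cc = STABLE_CC_RE.search(message) is not None
    return (has_cc or bool(fixes)), fixes
```

<p>
A message carrying only a Fixes tag yields the referenced commit ID, which a
release manager could then look up in each active stable branch.
</p>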
<h2 id="criteria">Criteria for accepting patches to the stable branch</h2>
Mesa has a designated release manager for each stable branch, and the release
manager is the only developer that should be pushing changes to these
branches. Everyone else should simply nominate patches using the mechanism
described above.
manager is the only developer that should be pushing changes to these branches.
Everyone else should nominate patches using the mechanism described above.
The stable-release manager will work with the list of nominated patches, and
for each patch that meets the criteria below will cherry-pick the patch with:
<code>git cherry-pick -x &lt;commit&gt;</code>. The <code>-x</code> option is
important so that the picked patch references the commit ID of the original
patch.
The following rules define which patches are accepted and which are not. The
stable-release manager is also given broad discretion in rejecting patches
that have been nominated.
<ul>
<li>Patch must conform with the <a href="#guidelines">Basic guidelines</a></li>
<li>Patch must have landed in master first. If the original patch is too
large and/or otherwise conflicts with the rules set out here, a backport is
appropriate.</li>
<li>It must not introduce a regression, whether at build time or at runtime.
Note: If the regression is due to a faulty piglit/dEQP/CTS/other test, that
test must be fixed first. A reference to the offending test(s) and
respective fix(es) should be provided in the nominated patch.</li>
<li>Patch cannot be larger than 100 lines.</li>
<li>Patches that move code around with no functional change should be
rejected.</li>
<li>Patch must be a bug fix and not a new feature.
Note: An exception to this rule is hardware-enabling "features". For
example, <a href="#backports">backports</a> of new code to support a
newly-developed hardware product can be accepted if they can be reasonably
determined not to have effects on other hardware.</li>
<li>Patch must be reviewed. For example, the commit message has Reviewed-by,
Signed-off-by, or Tested-by tags from someone other than the author.</li>
<li>Performance patches are considered only if they provide information
about the hardware, the program in question, and the observed improvement. Use
numbers to represent your measurements.</li>
</ul>
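<p>
Two of the rules above lend themselves to a mechanical pre-check before
nominating a patch. The Python sketch below is a rough illustration only. The
text does not say how the 100-line limit is counted, so this sketch assumes
added plus removed lines of a unified diff; the names
<code>changed_lines</code> and <code>precheck</code> are hypothetical.
</p>

```python
import re

TAG_RE = re.compile(
    r"^(Reviewed-by|Signed-off-by|Tested-by):\s*(.+)$", re.MULTILINE
)


def changed_lines(diff_text):
    """Approximate the added plus removed lines of a unified diff.

    Lines starting with "+++" or "---" are treated as file headers and
    skipped, so a changed line whose content begins with "++" or "--"
    is not counted.
    """
    count = 0
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue
        if line.startswith(("+", "-")):
            count += 1
    return count


def precheck(author, message, diff_text, max_lines=100):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if changed_lines(diff_text) > max_lines:
        problems.append("patch is larger than %d lines" % max_lines)
    # Keep only tags whose value does not mention the author.
    others = [who for _tag, who in TAG_RE.findall(message) if author not in who]
    if not others:
        problems.append("no review tag from someone other than the author")
    return problems
```

<p>
An empty result does not mean the patch will be accepted; the remaining rules
and the release manager's discretion still apply.
</p>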
If the patch complies with the rules it will be
<a href="releasing.html#pickntest">cherry-picked</a>. Otherwise, the release
manager will reply to the patch in question, either stating why it has been
rejected or requesting a backport.
A summary of all the picked/rejected patches will be presented in the
<a href="releasing.html#prerelease">pre-release</a> announcement.
The stable-release manager may at times need to force-push changes to the
stable branches, for example, to drop a previously-picked patch that was later
@@ -275,71 +339,15 @@ identified as causing a regression). These force-pushes may cause changes to
be lost from the stable branch if developers push things directly. Consider
yourself warned.
The stable-release manager is also given broad discretion in rejecting patches
that have been nominated for the stable branch. The most basic rule is that
the stable branch is for bug fixes only, (no new features, no
regressions). Here is a non-exhaustive list of some reasons that a patch may
be rejected:
<ul>
<li>Patch introduces a regression. Any reported build breakage or other
regression caused by a particular patch, (game no longer work, piglit test
changes from PASS to FAIL), is justification for rejecting a patch.</li>
<li>Patch is too large, (say, larger than 100 lines)</li>
<li>Patch is not a fix. For example, a commit that moves code around with no
functional change should be rejected.</li>
<li>Patch fix is not clearly described. For example, a commit message
of only a single line, no description of the bug, no mention of bugzilla,
etc.</li>
<li>Patch has not obviously been reviewed, For example, the commit message
has no Reviewed-by, Signed-off-by, nor Tested-by tags from anyone but the
author.</li>
<li>Patch has not already been merged to the master branch. As a rule, bug
fixes should never be applied first to a stable branch. Patches should land
first on the master branch and then be cherry-picked to a stable
branch. (This is to avoid future releases causing regressions if the patch
is not also applied to master.) The only things that might look like
exceptions would be backports of patches from master that happen to look
significantly different.</li>
<li>Patch depends on too many other patches. Ideally, all stable-branch
patches should be self-contained. It sometimes occurs that a single, logical
bug-fix occurs as two separate patches on master, (such as an original
patch, then a subsequent fix-up to that patch). In such a case, these two
patches should be squashed into a single, self-contained patch for the
stable branch. (Of course, if the squashing makes the patch too large, then
that could be a reason to reject the patch.)</li>
<li>Patch includes new feature development, not bug fixes. New OpenGL
features, extensions, etc. should be applied to Mesa master and included in
the next major release. Stable releases are intended only for bug fixes.
Note: As an exception to this rule, the stable-release manager may accept
hardware-enabling "features". For example, backports of new code to support
a newly-developed hardware product can be accepted if they can be reasonably
determined to not have effects on other hardware.</li>
<li>Patch is a performance optimization. As a rule, performance patches are
not candidates for the stable branch. The only exception might be a case
where an application's performance was recently severely impacted so as to
become unusable. The fix for this performance regression could then be
considered for a stable branch. The optimization must also be
non-controversial and the patches still need to meet the other criteria of
being simple and self-contained.</li>
<li>Patch introduces a new failure mode (such as an assert). While the new
assert might technically be correct, for example to make Mesa more
conformant, this is not the kind of "bug fix" we want in a stable
release. The potential problem here is that an OpenGL program that was
previously working, (even if technically non-compliant with the
specification), could stop working after this patch. So that would be a
regression that is unacceptable for the stable branch.</li>
</ul>
<h2 id="backports">Sending backports for the stable branch</h2>
By default, merge conflicts are resolved by the stable-release manager, who
should then provide a comment about the changes required, alongside the
<code>Conflicts</code> section. A summary of these will be provided in the
<a href="releasing.html#prerelease">pre-release</a> announcement.
<br>
Developers interested in sending backports are encouraged to use either a
<code>[BACKPORT #branch]</code> subject prefix or to provide similar information
within the commit summary.
<h2 id="gittips">Git tips</h2>


@@ -36,10 +36,10 @@ Hardware drivers include:
<li>Intel i965, i945, i915.
See <a href="https://01.org/linuxgraphics">Intel's website</a></li>
<li>AMD Radeon series.
See <a href="http://www.x.org/wiki/RadeonFeature">RadeonFeature</a></li>
See <a href="https://www.x.org/wiki/RadeonFeature">RadeonFeature</a></li>
<li>NVIDIA GPUs.
See <a href="http://nouveau.freedesktop.org">Nouveau Wiki</a></li>
<li><a href="http://www.x.org/wiki/vmware">VMware virtual GPU</a></li>
See <a href="https://nouveau.freedesktop.org">Nouveau Wiki</a></li>
<li><a href="https://www.x.org/wiki/vmware">VMware virtual GPU</a></li>
</ul>
<p>
@@ -57,7 +57,7 @@ Additional driver information:
</p>
<ul>
<li><a href="http://dri.freedesktop.org/"> DRI hardware
<li><a href="https://dri.freedesktop.org/"> DRI hardware
drivers</a> for the X Window System
<li><a href="xlibdriver.html">Xlib / swrast driver</a> for the X Window System
and Unix-like operating systems


@@ -24,7 +24,7 @@ This list is far from complete and somewhat dated, unfortunately.
<ul>
<li>Early Mesa development was done while Brian was part of the
<a href="http://www.ssec.wisc.edu/~billh/vis.html">
<a href="https://www.ssec.wisc.edu/~billh/vis.html">
SSEC Visualization Project</a> at the University of
Wisconsin. He'd like to thank Bill Hibbard for letting him work on
Mesa as part of that project.
@@ -40,14 +40,9 @@ Tungsten Graphics, Inc. have supported the ongoing development of Mesa.
<br>
<br>
<li>The
<a href="http://www.mesa3d.org">Mesa</a>
website is hosted by
<a href="http://sourceforge.net">sourceforge.net</a>.
<br>
<br>
<li>The Mesa git repository is hosted by
<a href="http://freedesktop.org/">freedesktop.org</a>.
<a href="https://www.mesa3d.org">Mesa</a>
website and git repository are hosted by
<a href="https://freedesktop.org/">freedesktop.org</a>.
<br>
<br>


@@ -17,11 +17,11 @@
<h1>Development Utilities</h1>
<dl>
<dt><a href="http://cgit.freedesktop.org/mesa/demos">Mesa demos collection</a></dt>
<dt><a href="https://cgit.freedesktop.org/mesa/demos">Mesa demos collection</a></dt>
<dd>includes several utility routines in the <code>src/util/</code>
directory.</dd>
<dt><a href="http://piglit.freedesktop.org">Piglit</a></dt>
<dt><a href="https://piglit.freedesktop.org">Piglit</a></dt>
<dd>is an open-source test suite for OpenGL implementations.</dd>
<dt><a href="https://github.com/apitrace/apitrace">ApiTrace</a></dt>
@@ -31,7 +31,7 @@
<dd>is a very useful tool for tracking down
memory-related problems in your code.</dd>
<dt><a href="http://scan.coverity.com/projects/mesa">Coverity</a><dt>
<dt><a href="https://scan.coverity.com/projects/mesa">Coverity</a><dt>
<dd>provides static code analysis of Mesa. If you create an account
you can see the results and try to fix outstanding issues.</dd>
</dl>


@@ -18,7 +18,7 @@
<p>
This page lists known issues with
<a href="http://www.spec.org/gwpg/gpc.static/vp11info.html" target="_main">SPEC Viewperf 11</a>
<a href="https://www.spec.org/gwpg/gpc.static/vp11info.html" target="_main">SPEC Viewperf 11</a>
and <a href="https://www.spec.org/gwpg/gpc.static/vp12info.html" target="_main">SPEC Viewperf 12</a>
when running on Mesa-based drivers.
</p>
@@ -66,10 +66,10 @@ either in Viewperf or the Mesa driver.
<p>
These tests use features of the
<a href="http://www.opengl.org/registry/specs/NV/fragment_program2.txt"
<a href="https://www.opengl.org/registry/specs/NV/fragment_program2.txt"
target="_main">
GL_NV_fragment_program2</a> and
<a href="http://www.opengl.org/registry/specs/NV/vertex_program3.txt"
<a href="https://www.opengl.org/registry/specs/NV/vertex_program3.txt"
target="_main">
GL_NV_vertex_program3</a> extensions without checking if the driver supports
them.
@@ -86,7 +86,7 @@ Subsequent drawing calls become no-ops and the rendering is incorrect.
<p>
These tests depend on the
<a href="http://www.opengl.org/registry/specs/NV/primitive_restart.txt"
<a href="https://www.opengl.org/registry/specs/NV/primitive_restart.txt"
target="_main">GL_NV_primitive_restart</a> extension.
</p>


@@ -18,7 +18,7 @@
<p>
This page describes how to build, install and use the
<a href="http://www.vmware.com/">VMware</a> guest GL driver
<a href="https://www.vmware.com/">VMware</a> guest GL driver
(aka the SVGA or SVGA3D driver) for Linux using the latest source code.
This driver gives a Linux virtual machine access to the host's GPU for
hardware-accelerated 3D.
@@ -62,9 +62,9 @@ these instructions explain what to do.
For more information about the X components see these wiki pages at x.org:
</p>
<ul>
<li><a href="http://wiki.x.org/wiki/vmware">
<li><a href="https://wiki.x.org/wiki/vmware">
Driver Overview</a>
<li><a href="http://wiki.x.org/wiki/vmware/vmware3D">
<li><a href="https://wiki.x.org/wiki/vmware/vmware3D">
xf86-video-vmware Details</a>
</ul>
@@ -82,8 +82,8 @@ The components involved in this include:
<p>
All of these components reside in the guest Linux virtual machine.
On the host, all you're doing is running VMware
<a href="http://www.vmware.com/products/workstation/">Workstation</a> or
<a href="http://www.vmware.com/products/fusion/">Fusion</a>.
<a href="https://www.vmware.com/products/workstation/">Workstation</a> or
<a href="https://www.vmware.com/products/fusion/">Fusion</a>.
</p>
